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Abstract 

Our visual system is a remarkably flexible and reliable source of information 
about the world. One of the major challenges for appreciating how vision contributes 
to our understanding of the world is understanding how it copes with the wide 
variety of lighting conditions, surfaces, and surface markings to provide accurate 
representations of the surfaces around us. The goal of the research reported here is 
to gain a better theoretical understanding of what lies behind the visual system's 
ability to generate robust surface interpretations from single grey scale images of 
smooth surfaces. In the course of doing this, a new robust shape from shading 
method is developed. 

The image irradiance equation is written using coordinate independent notation 
and concepts from modern differential geometry and global analysis. This is done 
to help make explicit the assumptions about the image formation process, and to 
delay making these assumptions as long as possible. The method of characteristic 
strips used by Horn (Horn, 1975) can be interpreted as a dynamical system on the 
five-dimensional space of tangent planes, C(IR 3 ,2). Modern methods for analyzing 
the behavior of dynamical systems are used to show that solution surfaces for the 
shape from shading problem are invariant manifolds of the flow generated by the 
image dynamical system. The rest of the analysis assumes orthographic projection 
of the image and a space-invariant reflectance function, but does not assume any 
particular form or symmetry for the reflectance function. 

Near critical points in the image dynamical system due to certain critical points in 
a smooth image, in general (i.e. in the absence of special symmetries) the dynamical 
system approach implies there will only be four possible smooth solution surfaces for 
the shape from shading problem. Two of these are the stable and unstable manifolds 
associated with the image dynamical system critical point. Two implementations 
for finding the unstable (or stable) manifold in this dynamical system are developed 
using the image dynamical system directly. 

The shading information in a patch containing a piece of the bounding contour 
is also examined, and it appears to contribute more to an assessment of a reflectance 
function choice than to the determination of patches of solution surfaces consistent 
with the image. 

Finally directions for future work are suggested, and some guidelines and caveats 
are provided for the development of image analysis systems based on these ideas. 
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Chapter 1 

Introduction and Background 



Our visual system is a remarkably flexible and reliable source of information 
about the world. A major challenge is understanding how vision copes with the wide 
variety of lighting conditions, surfaces, and surface markings to provide accurate 
representations of the surfaces around us. 

The goal of the research reported here is to gain a better theoretical understand- 
ing of the visual system's ability to generate robust surface interpretations from 
single grey scale images. In the course of doing this, we also suggest a new type 
of shape from shading method. We will write the image irradiance equation using 
notation and concepts from modern differential geometry and global analysis. The 
method of characteristic strips used by Horn (Horn, 1975) can be interpreted as 
a dynamical system on the five-dimensional space of tangent planes, C(JR ,2). We 
will make use of modern methods for analyzing the behavior of dynamical systems. 
Some of these tools have already been brought to bear on knotty research areas such 
as quantum mechanics, relativity, and cosmology (Hawking and Ellis, 1973, Edelen, 
1985, Abraham and Marsden, 1985). We show that solution surfaces for the shape 
from shading problem are invariant manifolds of the flow generated by the image dy- 
namical system, and that the stable and unstable manifolds associated with certain 
critical points in the image play an important role in determining solution surfaces. 
We will show two implementations for finding the unstable (or stable) manifolds in 
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this dynamical system using the image dynamical system directly. We also analyze 
the shading information in a patch containing a piece of the bounding contour, and 
conclude that it contributes more to an assessment of a reflectance function choice 
than to the determination of solution surfaces consistent with the correct reflectance 
function and image. Finally we suggest directions for future work, and provide some 
guidelines and caveats for the development of image analysis systems based on these 
ideas. 

1.1 Introduction 

Although there are a variety of sources of information available to the human 
visual system including stereo, color, motion, and shading, we get very clear im- 
pressions of the three-dimensional character of a scene from a single grey-level still 
picture, even of scenes or objects that are not recognized as previously having been 
seen. This suggests that there is enough information in monocular grey-level images 
without motion for the visual system to arrive at a three-dimensional interpretation 
that is very convincing. The visual system is not always correct or even unambiguous 
in its interpretations: a picture can be interpreted "correctly" as a flat surface with 
shading variations or a clear window onto a scene; much of the cosmetic industry is 
dedicated to shape "enhancement" through shading. 

Operationally, we get information from our visual system about positions and 
surfaces allowing us to navigate and interact with the environment. We are also able 
to combine the sense of touch with images of an object, allowing us to recognize 
a handled object blindfolded even if we have previously only seen pictures of it. 
This suggests some common information structure concerning shapes and surfaces 
accessible by touch, vision, and coordination. It is not clear, however, how accurately 
the visual system estimates shape considered as the exact location in space of each 
point on a surface or as the exact orientation of the surface at each point. It is 
difficult to create psychophysical experiments to test this because it is difficult to 
quantitatively probe a subject's internal information about surface shape. There 



Chapter 1 Introduction and Background 

have been a few experiments along these lines recently (Mingolla and Todd, 1986, 
Bulthoff and Mallot, 1987) which suggest that our impressions of surface orientation 
are far more qualitative than we might like to believe. Nonetheless, we are interested 
in how much information about shape is contained in an image of a surface; answers 
to questions like these at least provide upper bounds on what the visual system can 
do in the absence of other information or assumptions. 

One approach to studying what we get out of an image through vision is to try to 
understand what it is possible to discover from an image under different assumptions. 
Without any assumptions about lighting conditions or surface properties, there is no 
hope for recovering any information about surfaces in space: one can take a nearly 
arbitrary smooth surface and paint it to give the same image as another smooth 
surface without paint. The human visual system is apparently capable of using more 
than one set of assumptions: consider again the dichotomy between a picture as 
shaded surface and as window onto a scene. At the same time, it is difficult for us to 
entertain a continuum of possible interpretations: without extra cues, it is hard for us 
to interpret a flat picture of a scene as a different, curved, carefully painted surface. 
Some of Escher's delightfully bewildering works take advantage of such confusions in 
the visual system (Figure 1.1). 

1.2 Background 

Berthold Horn's early work on the shape from shading problem (Horn, 1975) 
examined it as a problem of physics, looking at the process of image formation and 
how light is reflected from objects and concentrated to form images. He defined 
a summary function, the reflectance function, that contained all the relevant local 
information about lighting conditions and surface reflecting properties under the as- 
sumption that reflecting properties of a surface patch were dependent solely on the 
orientation of the surface, and were constant with rotations of the surface around 
its normal. With additional assumptions of no cast shadows and no mutual illumi- 
nations, the brightness of a point in an image depends on the location of the point 
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Figure 1.1. "Drawing Hands" by M. C. Escher (lithograph, 1948). 



in space, the orientation of the surface in space at that point, and the reflectance 
function for the lighting conditions and surface material. 

Assuming a known reflectance function, Horn was able to characterize the shape 
from shading problem as a nonlinear first order partial differential equation, the 
image irradiance equation. Classically, such an equation is solved by the method 
of characteristics, and Horn used this technique to develop a method for solving 
for the surface given some initial curve lying on the surface with known surface 
normals. Essentially, this is a Cauchy problem, and the solution proceeds along 
adjacent characteristic curves beginning at the known curve of data. These curves 
together with the surface orientations for the solution surface along them are known 
as characteristic strips. 
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Horn used tools from partial differential equations: calculations were generally 
done in a particular coordinate system, and all derivatives were partial derivatives 
with respect to the coordinates of that system. Horn assumed orthographic projec- 
tion onto the image surface, and used orthogonal coordinates x, y, z with z-axis in 
the direction of assumed parallel projection. He used "gradient space" to represent 
surface normal directions: the normal to the surface z = f(x, y) in the (x, y, z) coor- 
dinate system is given by (p, q, —1), where p = f x , and q = f y . Unfortunately, these 
coordinates for the surface normal become unbounded when the visual boundary, or 
bounding contour, of an object is reached, where the tangent plane of the surface 
becomes parallel to the projection direction. This was later handled by using a differ- 
ent coordinate system based on stereographic projection of the unit sphere (Ikeuchi 
and Horn, 1981) at the cost of increased complexity of the coordinate expressions. 

Horn recognized the importance of the critical points in the image. Given the 
reflectance function, these could be taken as points in the image with known orienta- 
tion. As he noted, the characteristic trajectories could not be used to draw out the 
surface from the critical point, so he constructed a spherical cap consistent with the 
critical point orientation and used a small closed contour on this cap as his initial 
condition curve. 

Direct integration of the characteristic equations to find the characteristic strips 
suffers from noise sensitivity in practical implementations. As the solution proceeds 
along constructed curves from the initial condition curve, these curves can deviate 
as a result of quantization error and other noise influences. Horn and Brooks (Horn 
and Brooks, 1986) review and compare a number of related methods by different 
researchers for making the solution of the shape from shading problem a global one, 
allowing data from the full image to contribute to finding stable, relatively robust 
solution surfaces. They provide a recipe for generating shape from shading methods. 
These are relaxation methods based on minimizing a certain measure, typically the 
integral over the image region of some combination of the error in the image irradiance 
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equation and a penalty term for departure from smoothness. In some cases, these 
methods have been derived using gradient space coordinates; in others, stereographic 
coordinates. Horn and Brooks derive methods based on direct calculations with the 
unit normal vector to the surface as well. 

Various smoothness, integrability, or regularization terms have been tried by 
various researchers. Horn and Brooks indicate that enforcing the correct integrability 
constraint, p y — q x = in gradient space coordinates, does not immediately yield 
a convergent relaxation scheme. However, as Frankot and Chellappa (Frankot and 
Chellappa, 1987) point out, using a different penalty term instead may yield a non- 
integrable set of surface normals. The resulting set of normals may be a smoothly 
chosen set of unit vectors, but may not be surface normals for any possible two- 
dimensional surface. 

Pentland takes a different approach (Pentland, 1982, 1984). Rather than assum- 
ing full knowledge of the reflectance function, which is not available for human vision, 
he makes slightly less restrictive assumptions about the reflectance function (e.g. it 
is Lambertian, but with unspecified direction), and makes more assumptions about 
the surface structure. Pentland shows that if one assumes that the surface at a point 
is spherical, i.e. can be fitted by a spherical patch, then Lambertian reflectance gives 
a unique solution for the surface at that point. 

As one might expect and as Pentland points out, not all local image intensity 
patterns can be accounted for by a spherical patch. If x,y are image coordinates 
and / is the image intensity, then, according to Pentland, patches where I X x/Iyy < 
cannot be fit by spherical surfaces. If we consider the image as a graph of image 
intensities over the image coordinates, such patches correspond to saddles in the 
graph considered as a two-dimensional surface. A solution can still be found where 
the principal curvatures have equal magnitude, but it is no longer a unique solution. 

Pentland recognizes that some spherical patch solutions for certain kinds of image 
data are less likely to be reasonable than others. If one of the second derivatives of 



12 



Chapter 1 Introduction and Background 

the image intensity is zero, for example, the spherical patch solution involves very 
special placement of the light source or the surface patch. A more likely solution 
would have unequal principal curvatures with one of the principal curvatures zero. 
In a man-made environment of cones, planes and other ruled surfaces, these points 
would form regions, but it is not clear that the visual system evolved to be specially 
adapted to these unusual surfaces. On a general smooth surface, these points will 
occur on one-dimensional curves and so are sparse. l 

One problem with this approach is that it is an extremely local analysis. In 
general, the only non-planar surfaces composed completely of equal curvature points 
(i.e. umbilic points) are pieces of spheres. Although one may be able to generate 
a spherical solution patch locally consistent with the image for many points in an 
image, it is not clear that the surface normals of these patches will be able to form 
a smooth surface. Nor is it clear that such a surface would have much to do with 
the original surface imaged. To compensate for the lack of generality of this sur- 
face structure assumption, Pentland is led to statistical methods to estimate surface 
orientation based on correlations in natural images: heuristic estimators for surface 
orientation are proposed. 

Pentland's tools are again those of partial differential equations. A coordinate 
system is chosen, and derivatives are partial derivatives taken with respect to these 
coordinates. Pentland in places changes to a coordinate system parallel to the prin- 
cipal curvature directions, and indicates the change in coordinates with matrices 
when needed. To simplify the expressions, he assumes the surfaces of interest are 
exactly second order, an approach which may lose significant contributions to surface 
derivatives from higher order terms. 



It must be said that the natural environment in which we evolved is not smooth either. Recently, 
Pentland (Pentland, 1986) has suggested making use of fractal models of surfaces to help understand how 
a visual system might handle images of rough or textured images. To do such analyses carefully requires 
even more geometrical sophistication than the smooth approach: problems of differentiability and definition 
loom large. Perhaps because of our own visual experience which seems to give us smooth models of surfaces 
in our world in spite of its true complicated nature, most theoretical vision research has shied away from 
grappling with the complexities involved. 
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Frankot and Chellappa (Frankot and Chellappa, 1987) suggest a way of taking a 
set of non-integrable surface normals and producing an integrable set by projecting 
the surface normals onto the nearest set of integrable normals, where "nearest" is 
defined by a global integral distance measure defined in the gradient space coordinate 
system (and dependent on that coordinate system). Together with a method derived 
from Horn and Brooks, 1986, they construct a more quickly convergent algorithm 
guaranteed to form an integral set of normal vectors, and suggest their method could 
be applied to Pentland's normals to generate an integrable surface. Here too all the 
development is done in a single coordinate system, the gradient space coordinate 
system. 

Pentland suggests classifying points based on the image derivatives (Pentland, 
1984). Haralick and co-workers (Haralick et al, 1983) have carried this out more 
fully, looking at the surface formed by the graph of image intensities versus image 
coordinates. They define a set of features — peaks, pits, ridges, ravines, saddles, flats, 
and hillsides — in terms of the first and second derivatives of the image intensity that 
can describe each point on the image surface. Hillsides are defined as points that 
are not any of the other features, and will be the label that most points in an image 
receive. 

Haralick et al consider the notion of invariance to various transformations as 
important. They point out that one of the advantages of this set of labels is that 
the labeling is independent of monotonic transformations of image intensity values: 
pits stay pits, and so forth. In addition the features are defined in terms of the 
gradient of the image intensity and the second derivative of the image intensity, 
both of which can be considered as tensors with invariant definitions with respect 
to changes in the image coordinate systems. Although hillsides could be further 
classified, these subclassifications do not remain invariant under monotonic image 
brightness transformations. 

Haralick et al show a procedure for deriving these labels from an image using 
cubic polynomial patches fitted to the image and then analyzed for features. Pong 
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et al (Pong et al, 1985) begin analytical work on this representation by applying it 
to synthetic images of simple solids, calculating theoretically where features should 
lie on the image and comparing this to the features derived directly from the image. 
Koenderink and van Doom (Koenderink and van Doom, 1980) have looked at 
the field of isophotes, or constant brightness contours, of an image and discussed 
some of their geometric features. They note that the Weingarten map, which maps 
the surface in space to the Gaussian sphere of unit surface normals, maps constant 
brightness contours on the surface to level sets of the reflectance function on the 
Gaussian sphere. In order to make sense of the infinity of possible arrangements 
of constant brightness contours potentially created by a given reflectance function, 
they restrict themselves to Weingarten maps that are "stable," or generic, meaning 
their fundamental geometry is not changed by small perturbations. This still leaves 
the wide class of Weingarten maps that are mostly one to one, with one-dimensional 
curves of points that are fold singularities of the Weingarten map, and isolated points 
that are cusps of the Weingarten map. 

They classify various regions of such a surface using parabolic lines (folds in the 
Weingarten map), and discuss different causes for critical points in the image: spec- 
ularities, certain points on the parabolic lines, and critical points of the reflectance 
function itself. These latter two cases will be discussed in more detail in Chapter 5. 
Blicher (Blicher, 1983, Blicher, 1985) has used some of the modern language of 
differential topology to discuss stereo matching and other correspondance problems 
and their limitations as well as examining invariant features of a grey-level image. 
Chen and Penna (Chen and Penna, 1986) have used some language from modern 
differential geometry to state certain ideas about the study of photometric stereo 
and non-rigid objects. 

Several difficulties and issues are common to the motivation and execution of 
shape from shading methods. One problem is the attention focused on representa- 
tion. In order to test methods for deriving shape from shading, some coordinate 
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system must be chosen in which to represent various parameters of the surface. The 
computer must work with numbers, and numbers for surface calculations come from 
coordinate systems. The researcher typically uses this same coordinate system to 
derive and reason about methods. A single coordinate system is typically not able 
to cover an entire surface, and so there are problems with orientations that happen 
to fall outside the span of the major coordinate system. Algebraic tractability be- 
comes a major factor in looking at results: with some coordinate systems, certain 
expressions become very complicated and are discarded as infeasible. 

Modern differential geometry and global analysis provide tools for studying sur- 
faces directly and independently of coordinate systems. This is not to say coordinate 
systems are not useful; rather, much of the geometrical background may be defined 
using properties intrinsic to the surfaces involved, and a coordinate system chosen 
to exploit other particular features of the problem at hand. 

Enforcement of integrability is another issue common to different shape from 
shading methods. Most of the methods struggle to ensure that the surface normals 
produced are from a true two-dimensional surface, recognizing that not all collec- 
tions of surface normals can be considered as coming from a surface. Frankot and 
Chellappa (Frankot and Chellappa, 1987) make the most explicit separation between 
the two groups of surface normal distributions, those that are integrable and those 
that are not, projecting one group onto the other. In other methods, some "integra- 
bility constraint" is frequently added in as a "regularization" term that is supposed 
to roughly deal with getting a smooth surface as a solution. Unfortunately, solutions 
derived without strictly enforcing integrability may still not be integrable. 

As we shall discuss in Chapter 5, our results suggest integrability of a solution 
surface containing a critical point comes from the fact that it is an invariant smooth 
manifold containing the critical point. The algorithms proposed in Chapter 5 based 
on the image dynamical system attempt to find these invariant manifolds directly. 
Integrability of the corresponding surface normals comes for free in the noise-free case, 
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and can be used to monitor the progress of the algorithm in locating a reasonable 
solution when the image is noisy. 

The question of uniqueness of solution for the different methods has also proved 
difficult. One would like some knowledge about how many interpretations of an 
image are possible given the information in the scene. Bruss (Bruss, 1980) found 
a uniqueness result for Horn's method in the case of a reflectance function of a 
particular form: essentially those with level surfaces that are concentric ellipses in the 
gradient space coordinate system. Many reflectance functions do not have this form, 
however. Deift and Sylvester (Deift and Sylvester, 1981) investigated in some detail 
uniqueness results for the degenerate case of an image of a Lambertian hemisphere 
lit from the viewing direction. They found different classes of non-spherical, even 
non-symmetrical, local solutions which are C 2 almost everywhere. If the solution 
surface is required to be C 2 everywhere, the expected spherical solutions are unique. 
They used methods of functional analysis, working in a polar coordinate system 
to analyze this specific case, and did not address questions about stability of the 
unusual solutions. General results on uniqueness and properties of solutions are 
lacking from the literature. Brooks (Brooks, 1982) discusses the general problem of 
ambiguity of solutions surfaces in images, and shows families of solutions for certain 
degenerate cases, e.g. a plane and hemisphere lit from the viewing direction. He 
also briefly examines the relationship between uniqueness of solution and the kind of 
image patch in the case of a hemisphere: certain patches of the image provide much 
more constraint than others. 

Our results in Chapter 5 indicate that generically there will be at most four and 
at least two smooth solution surfaces through a patch of a smooth image containing 
a certain type of critical point due to a known reflectance function. These solutions 
correspond to the different invariant manifolds of the critical point seen as a critical 
point of the image dynamical system: the stable and unstable manifolds on which 
the image dynamical system has source and sink behavior, and potentially two other 
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invariant manifolds on which the image dynamical system has saddle behavior. The 
critical points in the image for which this result holds are those due to local reflectance 
function maxima (the usual case) or minima; we call these very good critical points 
of the image. 

The stability of scene interpretation to variations in assumptions has not been 
fully explored. Ikeuchi and Horn (Ikeuchi and Horn, 1981) did a number of ex- 
perimental tests of the performance of their shape from shading algorithm under 
violations of the assumed conditions, but this seems rare in the published literature. 

In Chapter 5 we test our algorithms with noise and against errors in the re- 
flectance function by looking at the effect of incorrect light source directions. In 
Chapter 7 we suggest some routes for future theoretical exploration of stablity of 
solution surfaces under variations in assumptions. In Section 5.3 we discuss the fun- 
damental instability of the global image dynamical system: generically, the invariant 
manifolds of various critical points in a dynamical system do not flow smoothly to- 
gether; however, an image due to a physical surface and a known reflectance function 
generates an image dynamical system which has the physical surface as an invariant 
manifold through all the critical points of the system. This creates a potential prob- 
lem for computational solutions, since non-generic dynamical systems can be very 
difficult to simulate. However, this also suggests a potential answer to the problem 
of determining the reflectance function (or a number of parameters of it): the wrong 
choice of reflectance function is almost certain to prevent invariant manifolds from 
flowing smoothly into one another, and so may be rejectable on these grounds. 

1.3 Overview of the Thesis 

In the rest of this thesis, we explore in more detail the first order partial dif- 
ferential equation used by Horn for the shape from shading problem. As indicated 
above, one can classically solve such a problem by solving a related set of ordinary 
differential equations, the characteristic equations. For a typical space- invariant re- 
flectance function such as the Lambertian, the one-dimensional curves which solve 
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the characteristic equations are trajectories through (x,y,p, q) space, where p and q 
are gradient space coordinates giving the orientation of the surface. These are the 
characteristic trajectories or strips. 

In Chapters 2 and 3 the shape from shading problem and the image irradiance 
equation are placed in a coordinate- independent and geometrically inspired setting. 
Coordinate independent definitions of the bounding contour (edges of the image due 
to the surface rolling away from the viewer) and the behavior of the image at the 
bounding contour are discussed. In Chapter 4 we show a somewhat dauntingly alge- 
braic but coordinate-independent approach to deriving the characteristic equations. 

A modern approach to the study of nonlinear ordinary differential equations 
like the characteristic equations considers them as a dynamical system. A dynamical 
system is essentially a vector field on some space; trajectories of the dynamical system 
are parameterized paths in the space that have derivatives equal to the vector field 
everywhere along them. For example, if the vector field represents the velocity at 
each point of fluid flowing in a pipe, the trajectories will be the time-dependent 
paths of points flowing along with the fluid. We can speak of the flow of a vector 
field as the collection of all the trajectories together. In Chapter 4 we consider the 
characteristic equations of the shape from shading problem as a vector field defining 
an image dynamical system. 

In Chapter 5 we examine critical points of the image and the image dynamical 
system. The qualitative study of dynamical systems emphasizes the role of critical 
points, places where the vector field is zero, in determining the overall structure 
of the flow lines. If the critical point is not "degenerate" in some sense, then the 
behavior of the nonlinear dynamical system will be quite close to the behavior of a 
linear approximation to the dynamical system near the critical point. It turns out 
that critical points of the image intensities due to critical points in the reflectance 
function give rise to critical points in the image dynamical system. 

Another important tool in the modern study of dynamical systems discussed in 
Chapter 5 is the idea of invariant surfaces for the system. For any point on such a 
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surface, the trajectory of that point (perhaps for a small time interval) also lies on 
the surface. Smooth solution surfaces for the shape from shading problem must be 
invariant surfaces of the image dynamical system: they are two-dimensional surfaces 
made up entirely of segments of characteristic trajectories. 

The two ideas of critical point and invariant surface come together quite literally 
at a critical point. The set of all points whose trajectories end up at a particular 
critical point as time goes to infinity form an invariant manifold called the stable 
manifold of that critical point; the set of points which end up at the critical point as 
time goes to negative infinity form an invariant manifold called the unstable manifold. 
Both the stable and unstable manifolds contain the critical point. There may also 
be other invariant manifolds containing the critical point. For hyperbolic critical 
points (defined in Chapter 5), for example those generically caused by maxima and 
minima of the reflectance function, near the critical point all these invariant manifolds 
are quite close to the invariant manifolds of the linearized dynamical system — in 
particular, the number of invariant manifolds and their tangents at the critical points 
are determined by the linearized system, as well as the type of flow restricted to the 
invariant manifold. As discussed in Chapter 5, in the shape from shading problem 
we can use the fact that there are usually only a finite number of invariant manifolds 
around very good critical points to get existence and uniqueness results on patches 
of the image around the critical points. 

In Section 5.3 we discuss how we can use the dynamical system itself to help 
find certain of the invariant manifolds around a critical point. Essentially we float 
an infinitely distensible surface in the flow of the vector field. If we are assured that 
some point on this surface will flow to the critical point (i.e. the initial surface cuts 
the stable manifold), and the rest of the points on the initial surface are allowed to 
follow the trajectories through them, then the surface will be stretched and deformed 
over time to approximate the unstable manifold near the critical point. 

In Section 5.3 we give a couple of examples of highly parallel algorithms based 
on this idea, and analyze their performance in the presence of noise and mistaken 
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reflectance functions by implementing the algorithms on a 16K CM-1 Connection 
Machine. The methods turn out to be stable in the presence of noise, and robust in 
handling errors in the reflectance function. 

Chapter 6 discusses the role of the bounding contour in constraining shape from 
shading solutions. The results suggest that the bounding contour does not provide 
"patch" constraints of the kind given by critical points, but does provide constraints 
on the reflectance function. 

Finally, Chapter 7 summarizes the results, and discusses future extensions, in- 
cluding caveats and suggestions for implementing a system to connect the patches 
of solutions generated by the local analyses made here. 

We begin looking at the shape from shading problem with global analytic tools 
by discussing the image projection map in the next chapter. 
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Chapter 2 
Image Projection 



In this chapter we look at how the image projection takes points in space and 
maps them to points in an image, and some of the consequences of this mapping. We 
begin with a discussion of some of the foundations of modern differential geometry, 
the concept of a manifold and a tangent bundle, and follow with an analysis of the 
image projection map, the bounding contour, and suggest possible extensions to time 
varying images. 

2.1 Mathematical Preliminaries 
2.1.1 Manifolds 

In this section a few of the basic tools and ideas from modern differential geometry 
are presented. A good introduction to the modern approach to differential geometry 
can be found in (Spivak, 1979); a brisker introductory approach with connections to 
dynamical systems and other areas of modern physics can be found in (Abraham, 
Marsden, and Ratiu, 1983). The reader is referred to either of these for the technical 
details omitted here. 

In both classical and modern differential geometry, the notion of coordinate in- 
variance plays an important role. In the classical view still adopted by many physics 
textbooks, a formula that maintains its character after a coordinate change is in some 
sense an intrinsically defined object: if we replace a coordinate system (x, y, z) with a 

23 



Chapter 2 



Image Projection 




Figure 2.1. Equivalence classes as the partition of a set. 



coordinate system (x', y', z'), a formula written with coordinates x, y, z becomes a for- 
mula in the functions x(x' ,y' , z'),y(x' ,y' ', z 1 ), z(x' ,y' , z')\ if, after simplification, the 
resulting formula looks exactly the same as the original, replacing the symbols x, y, z 
with x', y', z', then the formula is considered invariant to this coordinate change. 

2.1.1.1 Equivalence Class Approach 

A modern way to approach coordinate invariance is to use equivalence classes. 
A set X is partitioned into equivalence classes when it can be divided up into a 
group of non-overlapping sets, Xi, such that UXi = X, X{ D Xj ; = if i ^ j (Figure 
2.1). This gives rise to a number of useful properties. Note that every element of 
X is in one of the equivalence classes; we can refer to the set containing x as [x]. If 
[x] = [y], and [y] = [z], then we must have [x] = [z], since the two equivalence classes 
[x] and [z] share an element, y, so their intersection is non-empty; since the partition 
is non-overlapping, this must mean [x] = [z]. Clearly, if [x] — [y] then [y] = [x]; also 
[x] = [x]. 

These last three properties are those of an equivalence relation: an equivalence 
relation on a set X is a predicate ~ acting on two elements of the set with the 
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properties that x ~ x is true; x ~ y true implies y ~ x; and finally x ~ y and 
y ~ z imply a: <-- z. Equivalence classes and equivalence relations are essentially the 
same idea: given an equivalence relation, we can define the set [x] to be the set of 
all y G X such that y ~ x. The equivalence relation properties make the collection 
of sets {[x]} so defined into a collection of equivalence classes on X. Similarly, given 
an equivalence class partition of X we can define the equivalence relation x ~ y as 
true if and only if [x] = [y], and can refer to the collection of equivalence classes as 
Xj ~ or as X. 

Why are equivalence classes useful? They become very useful when properties 
or operations on the individual elements give rise to properties or operations on the 
equivalence classes considered as elements of the collection of equivalence classes. 
Essentially, the properties and operations that stick around after moving to equiv- 
alence classes can be considered as invariant under the equivalence relation defined 
by the equivalence classes. Consider a function / : X — » X. l If we can define a 
map / : X — ► X such that the functional diagram 

X -L X 



x -U x 

commutes, 2 where tt : X — > X is just the equivalence class map x I-* [x], then in 
some intuitive way we can consider / as "really" acting on the equivalence classes 
of X. The equivalence classes provide a way of looking at some of the behavior of 
/, how it moves elements between equivalence classes, while ignoring other parts, 
how it moves elements within an equivalence class. An extreme example might be if 



The notation / : A -* B means that / takes elements of the set A and gives values in the set B; thus, 
for a € A we have f(a) 6 B. 

A diagram like that above defines composition functions between the different domains. For example, 
t is a map from X to X, f is a map from X to itself, and f o ir defines a map from X to X whose value 
at p is f(ir(p)). The diagram is said to commute when following two sets of arrows from the same starting 
set to the same finishing set gives the same function: i.e., f o w = x o f. 
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Figure 2.2. The modern manifold: charts of an atlas. <j>i and 4> 2 are charts mapping open sets Ui and U2 
on M into IR 2 . 

/ = id, that is f([x]) = [x] for all [x] G X. This means that / does not move elements 
of X outside of their equivalence classes under ~ at all; in this case the equivalence 
classes of X are actually invariant sets of /. 

2.1.1.2 Manifolds 

The modern definition of a surface, or manifold, essentially relies on equivalence 
class definitions to get away from individual coordinate representations. First let us 
look at features of the modern definition of a manifold. 

The idea is to model the local behavior of the surface by coordinate charts (maps 
from the manifold to IR n ) which behave "reasonably" on pieces of the surface that 
are acted on by more than one chart (Figure 2.2). We start by requiring the manifold 
M to have an atlas of charts, meaning that there is a set of open neighborhoods U a of 
M that cover M and a corresponding set of invertible maps <f> a : M — ► IR" mapping 
each patch onto IR n . 3 These coordinate patches and coordinate maps give us local 
windows onto the surface. One analogy is a collection of television cameras covering 

This is for an n-dimensional manifold. One can use other kinds of linear spaces as so-called model 
spaces in place of IR" to define infinite dimensional manifolds. This can be used to make the calculus of 
variations into a true calculus with derivatives, etc. (Marsden and Hughes, 1983) 
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Figure 2.3. The sphere and two charts, <j> and i/>. 



a sporting event: when you look at the wall of television images at the studio, 
you are seeing different views of the same event. 4 We do need some assurance of 
consistency between the images: someone might have changed a channel. In the 
mathematical realm, we can require the charts to be consistent on their overlap: if 
we have <}>\ : U\ — ► IR™ and <fo : #2 — ► BFl" as coordinate charts on M, we can define 
a new map too^ 1 : V\ — > V2, where V\ = (f>i(U\C\U2) and V2 = 4>2{U\ (MJ2) are both 
sets in IR W . For consistency, we can require this overlap map between two real spaces 
to be continuous with continuous inverse, or C k with C k inverse, or smooth (C°°) 
with smooth inverse — whatever degree of smoothness we are interested in using to 
define and work with the manifold. 

As an example, consider the two dimensional sphere embedded in IR 3 (Figure 
2.3). One coordinate chart we might use is the map <f> : {(#, y, z) € S 2 \z > 0} — * IR 2 
given by <f>(x, y, z) = (a:, y): this is orthogonal projection onto the x-y plane. A point 



Marshall McLuhan would approve of our approach: our perception of an event is often denned these 
days by the different views provided by different television cameras. 
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on the sphere is given by (x, y, -\/l — x 2 — y 2 ) € IR within the domain of this chart; 
in fact, <f>~ (x,y) = (x,y, >/l — x 2 — y 2 ) since <f> o <j>~ 1 = id using these definitions. 

We could also take the chart ij> : {(x,y,z) £ S 2 \x > 0} — >• IR given by 
tp(x,y,z) = (y,z). Are ip and (f> compatible on their overlap? We look at i\) o </> _1 on 
the appropriate domain of IR 2 : we have 

V> o (fT (r, s) — 0(r, s, v 1 — r 2 — s 2 ) 
= (s, Vl — v 2 — s 2 ). 



Is the map (r,s) k-> (5, \/l — r 2 — s 2 ) smooth? Here we have to be careful to include 
the domain on which we are working: we are interested in the smoothness of this 
map not on all of IR 2 , but only on <f>(Ui C\ U2) where U\ = {(x, y, z) £ S 2 \z > 0} and 
U 2 = {(x,y,z) e S 2 |a;>0}. We have <f>{U x n U 2 ) = {(r,s)\r > ,r 2 + s 2 < 1}. We 



can now look at the smoothness of the overlap map (r,s) 1— >• (s, \/l — r 2 — s 2 ) on the 
overlap: the Jacobian of this map is 

1 



7 VI - r 2 - s 2 -s/VT^ 



=,2 



s 

and indeed for r > 0, r 2 + s 2 < 1, this Jacobian is smooth and is smoothly invertible; 
hence the overlap map is smooth and smoothly invertible, and the coordinate charts 
are smoothly compatible. 

Given one such atlas of charts for a manifold, we can now throw into the atlas 
all other charts that are compatible with it to create a maximal atlas. We include 
any other open neighborhood U on M with associated invertible map <?!>:[/—>■ IR n 
which is compatible with all the original charts. If we have two of these new charts, 
<f> and tf>, they will then be compatible with each other: on appropriate subdomains, 
we have 

for each i, where <f>{ are the original charts for the manifold; since the original atlas 
has neighborhoods covering the whole manifold, we can decompose the intersection 
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Figure 2.4. The geometric tangent space T P S is a plane in IR 3 tangent to the surface at p. 



of the domains of <f> and tj> completely into patches within different £/,-. Since ift and 
4> are compatible with all the <f>i by definition, and the compositions above have the 
same degree of smoothness, we have tp and <f> also compatible. 

A manifold is defined as a set of points with an atlas of charts compatible on 
their overlaps. The modern way to work with a manifold M is to always be thinking 
about this potentially quite abstract object while working with useful views of it 
provided by the coordinate charts in an atlas. The same manifold M may look quite 
different using different coordinate charts. Formulas in the two coordinate systems 
may look very different but refer to the same operations or points on the manifold. 
For example consider a plane parameterized using polar coordinates (r, 6) or cartesian 
coordinates (x,y). An arc of a circle can be given by r = c or by y = \Jc 2 — x 2 , 
depending on the coordinate system. The choice of a particular coordinate system 
can be made solely to simplify a particular calculation involving the arc. The arc is 
a geometric object independent of the charts used to view it. 
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2.1.2 Tangent Spaces 

We have defined a manifold intrinsically, that is without requiring the manifold 
to sit "inside" a larger space. 5 We can define the tangent space at a point on the 
manifold intrinsically as well. At first glance this seems unreasonable: our intuition 
about what a tangent space is typically comes from the embedded picture of a two- 
dimensional manifold in three dimensions (Figure 2.4): we think of it as a plane 
just tangent to a point on the surface, and this plane clearly sits in the full three- 
dimensional space. However, using equivalence classes we can dispense with the 
embedding space and still maintain the idea of the tangent space as a kind of linear 
approximation to the original surface. 

As with many good ideas in mathematics, there is more than one way to define 
the tangent space intrinsically, and all of these are equivalent. One can define it as 
equivalence classes of curves on the surface M, or as equivalence classes of derivative 
operators on real-valued functions of the manifold M, for example. Here we will 
discuss another approach tied more closely to the coordinate charts themselves. 

Let us consider two charts in an (assumed) smooth atlas for a manifold M, 4>\ and 
<f>2, with a non-empty overlap on M. The overlap map /?i2 = <l>2 ^i* 1 on ^1(^1 ^ ^2) 
is smooth by assumption. The derivative of /?i2, Dfli2, at a point <j>\{p) € IR n defines 
a linear map from IR™ to lR n . 6 If we pick V\ as a vector in IR n , then we can consider 
V2 = D(3\2(vi) as related to v\ through the overlap map /?i2- We can make this 
into an equivalence relation in the following way: consider all ordered pairs (<f>i,Vi) 
of charts 7 and vectors in ]R n . We define the equivalence relation ~ on this set as 



Whitney's theorem (Guilleman and Pollack, 1974) says that any manifold can be modeled as a sub- 
manifold of a higher dimensional space; it is not always conveniant or reasonable to do this, however; e.g, 
curved space- time does not necessarily sit "inside" some other larger "hyperspace". 

The differential Df x of a function / : IR m — ► IR" at a point x € TR-™ is the unique linear map such 
that \\f(x + h)- f(x) - Df x (h)\\ is 0(||A 2 ||) for all h € IR". See (Abraham, Marsden, and Ratiu, 1983) or 
the more introductory (Lang, 1968) for details: it is essentially the Jacobian without a particular coordinate 
choice. 



Including domains, although we don't write them for clarity of notation. 
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((f>, v) ~ {ifr,™) if and only if w = D(tf> o <j>~ l )v at p € M; i.e. w and v are related 
by the derivative of the overlap map. To see that this is an equivalence relation, we 
have to verify the equivalence relation properties: 

1) ((f), v) ~ ((f), v): we have the simple overlap map <f>o<j)~ l = id, soi; = D(id)(v) = 
id(v). 

2) ((f), v) ~ (0,w) implies (i}>,w) ~ ((f), v): from the first relation we have w — 
D(xf> o <f)~^)(v). Using the inverse function theorem and the smooth invertibility of 
if) o </> -1 , this means that D(<f> o tf>~ 1 )w = v, so that (if),w) ~ ((f), v). 

3) If ((j)\,v\) ~ ((f)2,V2) and (^2,^2) ~ (^3,^3), then ((f)\,v\) ~ (^3,^3): we have 

i>2 = D((f>2 o 4>\ )^i 
^3 = D(h ° 4>2 1 ) v 2- 
We can compose the overlap maps and use the chain rule to get 

v 3 = D(h o (f)2 l ) D(h o ^f 1 )^) 
= J D(^ 3 o^ 1 )(t, 1 ), 

SO (<f>l,V\) ~ (<^3,U3)- 

We can now define a tangent vector for M as an equivalence class [((f), v)] under 
the above relation. We think of ((f), v) or just v as the coordinate representation by 
<f> of the tangent vector. Why is this a reasonable thing to do? Again we can think 
about the charts as windows onto the real object we are looking at. If a tangent 
vector v of M is to be reasonably defined, we expect that the "snapshots" of v in 
different charts should be nicely related; what could be nicer than to have them 
related directly by the overlap maps that connect the different pictures? 

We can see how this works by looking at tangent vectors to curves on an n 
dimensional surface embedded in ]R TO (Figure 2.5). A curve on the surface is given 
by a map a : IR — * M from an interval of the real line to the manifold. Since 
M C IR m , a describes a curve in IR m , and we can define its derivative at any 
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4> 




Figure 2.5. Connection between modern and geometric definition of tangent vectors. Here a(s) is a curve 
in 5 with tangent vector v at p. It is mapped by the chart <f> to a curve in IR 2 with tangent vector 

D4> P M = (3/3*M"(*))lt=o. 
The set of image vectors D^>p(v) for all charts <j> is the equivalence class definition of the tangent vector. 



point along the curve in space by the usual time derivative a'(t). 8 Now let us use 
a coordinate chart <f> for M containing a piece of the curve a: we can look at the 
image of the curve under the coordinate chart, <f> o a(t); this is a curve in the real 
space IR n . The time derivative of this combined map gives a vector ^((p o a)(t) in 
IR n which can be thought of as the image of the tangent vector a'(t) under the map 
</>; indeed we would like to write 

j 

—(<f>oa)(t) = D<f>oa\t) 

using the chain rule if we could make sense of D<f> on the manifold M. 



Note that in terms of the derivative operation, D, a'(t) = Da t (l). 
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If we have another chart tp on the same area of M, we have a different image of 
the tangent vector a'(t): we have ^(^oa)(f). How are these two projected coordinate 
vectors related? Let us look at t — 0. We define 

ai(t) = 4> o a(t) 

oi2(i) — V" ° a W 
«i = ai'(0) 
t; 2 = a 2 '(0). 
We can use the overlap map to relate the two curves ct\ and a 2 : we have 

ip o (jT l o ai(^) = 02(0 

(ignoring domain details), so taking the time derivative at t = we have 

j 
— (Vjck^- 1 oori)(0) = v 2 

D(i>o <^ _1 )(ui) = v 2 . 

Since o cf>~ 1 : ]R m — >• Ht TO is a map between vector spaces, D(tp <^ -1 ) makes sense, 
and we have used the usual real vector space chain rule on the composition of real 
vector space maps (ifi o <^ -1 ) o (ai). From the definition of the vector equivalence 
relation given earlier, we have (</>, v\) ~ (Vs^2), or [(</>, ^1)] = [(Vs^)]- The members 
of the equivalence class [(</>, v)] are indeed views of the geometric tangent vector in 
different coordinate systems. 9 

From this we see that when the manifold is embedded in a vector space, the 
equivalence class definition of a tangent vector puts together all the coordinate im- 
ages of a "true" tangent vector into one equivalence class. We avoid the need for an 
embedding space in general by defining a tangent vector to be the set of all possi- 
ble related image vectors under all possible consistent charts where the relation is 
through the overlap map. This is a bit like defining a tennis player's stroke as all 



For technical details see (Spivak, 1979). 
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possible television images of it: we do not need to postulate the existence of Rod 
Laver in order to admire the stroke and analyze it in detail; all we need are all pos- 
sible different views of it. Vision can be considered as giving us two time- varying 
charts of the two-dimensional surfaces around us; judging our reaction to animated 
artificial figures, the available views alone can be very powerful in determining our 
perception of reality. 10 

We can do more with the equivalence class definition of tangent vectors at a 
point on the manifold. The equivalence classes themselves form a vector space: for 
a a scalar we can define 

a[(<t>,v)\ = [(</>, av)] 
W, v)} + [(V>, w)) = [(<£, v + D(<j> o V*" 1 )«,)], 
where the slightly complicated expression for the sum just reflects the need to move 
the representative vectors into the same chart before adding them; we have [(<£, v)] + 
[{4>i w)} = [((j>, v + w)] when the representative vectors are in the same coordinate 
chart. The chain rule and linearity of the derivative map on vector spaces can be 
used to show that these definitions make sense; for example, if (<f>\,v\) ~ (^2,^2) 
then we know 

V2 = D(<f> 2 o<l)- l )(vi), 

so 

av 2 = D(<^ 2 o^ 1 )(au 1 ), 

where a is a scalar, and hence (<f>i,avi) ~ (<f>2,av 2 ) so that the definition of a[(<f>,v)] 
does not depend on the choice of representative (<f>, v) for the equivalence class. Es- 
sentially, the basic vector space structure of IR n is preserved by the mapping from 
representatives to equivalence classes even though individual identities of represen- 
tatives are (usefully) confused by the mapping. 



Jan Koenderink has recently proposed a representation scheme for the human visual system based on 
coordinate independent objects (Koenderink, 1988). 
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We will call the vector space of all equivalence classes of representatives the 
tangent space T p M at p of M. Given a map / : M -> N taking elements of the 
manifold M over to the manifold N, we can define a linear derivative map Df p : 
T P M —> T q M taking elements of the tangent space of M at p to elements of the 
tangent space of N at q = f(p): 

Df p v = Df p ([<j>, „]) = ty, D^ofo r'Wl 

where ^ is a coordinate chart on N and <f> is a coordinate chart on M . Essentially, we 
are "writing the map / in coordinates" when we look at ip o /o <jr l : we use the chart 
<f> to examine a piece of the domain and the chart V> to examine a piece of the range. 
Note that tpofo^' 1 is a map from IR" to IR m (assuming N has dimension IR m ), and 
so the derivative of this real vector function is the standard vector space derivative. 
The definition of Df as an equivalence class mapping makes sense because the chain 
rule connects different representatives for the same equivalence classes: if is another 
coordinate chart on N, 4> another coordinate chart on M, and [(</>, v)] = [(&«)], we 
have 

[V>, D{%l> o / o (T 1 )^)] = [0, D$ o V -1 ) o D(iP o / o <^ _1 )(t;)] 

= [Vi,^o/o^- 1 )(t,)] 
using the definition of equivalence, and 

[0, D{*l> o / o <l>- l ){v)] = [0, D$ o / o (T 1 ) o D{4> o 4>){v)\ 

= Df p ([(lD(^o4>)(v)] 
= Df,([(lv]) 1 

using the definition of Df p and using the definition of equivalence between [(<f>, v)] 
and \(4>, »)]. Thus the definition of Df p makes sense. We can consider the derivative 
D{%j)ofo<f)- 1 ) as the coordinatized view of Df p in just the way we considered v e IR" 
as the coordinatized version of v = [(<j>, v)]. 

If we take the simplest case where M is the space IR", then the identity map on 
the open "subset" IR n of IR" makes a perfectly good chart. T p TR n at a point p € IR n 
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TS 
P 

Figure 2.6. The tangent bundle to a curve: by turning the 1-d tangent spaces on their sides, we can see 
how TS ~ IR x IR. 

can now be identified with IR" at each p, since we have a single global chart. Even 
if we add other charts to the atlas (e.g. polar coordinates or other things), we can 
still consider this as the canonical coordinate system, and in general we can identify 
n- vectors IR™ as elements of T P IR™ at some p G IR. The full collection of tangent 
vectors for all p G IR 3 , TIR 3 , is equivalent to IR 3 x IR 3 : we consider a tangent vector 
v p G T P 1R 3 to be just (p, v) G IR 3 x IR 3 . 

On a general manifold M, given a coordinate chart <f> : U C M — » IR n there 
are certain vector fields often labeled as special. These are the vector fields whose 
representatives in IR n under the chart are the standard orthonormal basis vectors. If 
we label the components of the chart as <j)(p) = (a; 1 (p), . . . , x n (p)), then the associated 
vector fields are called d/dx % . Using the idea of the derivative of a mapping between 
manifolds, we can define the vectors d/dxi by 

D +> Xez) = ei ' 

where e; = (0, . . . , 0, 1, 0, . . . , 0) T where the 1 appears in the i-th position. 11 



The notation d/dx % comes about because if / : M — + IR is a real function on the manifold M , then 
Df (d/dx') in coordinates is given by D(f o <j>)(ei) = df/dx{, where in the last expression / is assumed 
written in coordinates with respect to the chart (j>. In fact, there is another definition of the tangent space 
that considers tangent vectors as real-valued linear operators on real-valued functions of the manifold. 
(Warner, 1971). 
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In general we can collect all the tangent spaces T p M together and make the 
collection TM = L)T P M itself into a manifold. To do this, we need to find an atlas 
for TM: we can construct one from an atlas for M. We take a chart for M, {U, (f>) 
(remember U is the open set of M on which the chart map <f> is defined), and consider 
the derivative map D<f> p for all p in the open set U: since <j> : U — ► IR n and both U 
and IR n are manifolds, we can define D<f> p as an equivalence class mapping as we did 
above. We can use the map 

ftv„) = Mp),/ty„(v,)) : TU - IR n x IR ra 

as our new coordinate chart on the new open set TU = \J p€ (j T p M of TM; note 
that we have identified D<f> p (v p ) € T p JR n with the appropriate n-vector in IR n . An 
overlap map for this new chart would look like 

Pu(x,v) = (<f> 2 o <j>X l {x),D{4>2 o <t>i l ) x {v)\ 

which has the usual overlap map as the first component, and the derivative of the 
overlap map as the second component; assuming the charts are smooth, for example, 
this combined overlap map is also smooth. 12 Essentially, every tangent bundle of 
an m-dimensional manifold locally (i.e. viewed through a chart) looks equivalent 
to IR m x IR m . Globally this is not the case: for example, there is a result which 
says there can be no completely non-zero vector field on the two-dimensional sphere 
(the so-called Hairy Ball Theorem (Abraham, Marsden, and Ratiu, 1983)); this is 
certainly not true of IR 2 , and so TS 2 and IR 2 x IR 2 are not equivalent, even though 
pieces of TS 2 are equivalent to pieces of IR 2 x IR 2 . 

Basically, we can reason about the tangent vectors defined as equivalence classes 
as if they were the familiar tangent vectors in space on which our intuition is based. 



One can define the tangent space in yet another way through overlap maps alone, considering them 
as "glue" between patches of JR.". 
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Modern differential geometry provides a bag of tools and techniques defined intrin- 
sically (via the equivalence class definitions of tangent space, etc.). One reason for 
using these tools is to ensure that constructions do not depend on a particular co- 
ordinate system; instead, they can be thought of as part of the intrinsic geometry 
of a situation. In practice, in the early examination of a physical problem it is very 
hard to reason usefully without eventually picking coordinates. From a pragmatic 
point of view, the intrinsic definitions of the tools and constructions of differential 
geometry and global analysis means that they are available even if we decide to pick 
bizarre coordinates to exploit special features of the problem under consideration. 

2.2 The Image Projection Map 

As suggested by the previous pictures used to help explain some of the ideas 
of modern differential geometry, vision and geometry have very close ties. It might 
be said that vision provides us with a pair of time- varying charts of the surfaces 
around us, together with brightness information at each point of the image. Par- 
ticular coordinate systems do not have such a close relationship with vision: we 
are not aware of some pre-existing coordinate system in reasoning about surfaces or 
interpreting an image. It seems to make sense to try to use some of the invariant 
language and notation of the modern approach to describe features of the image 
formation and interpretation process. Often coordinate representations of things can 
look complicated and messy even though the underlying geometric idea may not be 
so complicated. If we begin with invariant definitions of some of the fundamental 
ideas, we can move to various coordinates suited for particular analyses of details 
with (hopefully) a minimum of confusion. 

Let us begin by looking at the image projection map. A projection system can 
be thought of as a way of mapping points in IR 3 to points on some two dimensional 
imaging surface 7. 13 (See Figure 2.7.) We can call this map it 1 : IR 3 -» I, and will 

Really we map from some large open set of IR 3 . We will mostly ignore this technical detail. (Blicher, 
1985) tries to keep track of all the different open sets in his definitions, and it can be confusing. 



38 



Chapter 2 



Image Projection 




n' 




^(P) 



Figure 2.7. Image projection from IR to the image /. 
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Figure 2.8. Image projection as a fibre map 



assume it is a smooth map. Typical projection maps used are orthogonal projection 
onto a plane, central projection onto a plane, or central projection onto a sphere 
centered at the projection center. Although central projection onto a sphere perhaps 
best models the physical projection of light onto the retina while central projection 
onto a plane best models projection of light onto film in a camera, it is not clear what 
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Figure 2.9. Embedding a two dimensional surface in IR 3 . 



the "right" projection for vision should be: do we attempt to include the inhomo- 
geneities of the retinal neural distribution in the projection by centrally projecting 
onto some distorted surface? Ideally we would like results that are insensitive to 
variations in the projection system; we will try to work in that direction in what 
follows. 14 

The distinction between images made at a large distance (compared to the focal 
distance) under central projection to a plane or orthogonal (scaled) projection to a 
plane is quite small, so orthogonal projection is often used because its coordinate 
representation is much simpler. We will do the same as soon as the going gets rough. 



One abstract way of viewing projection is to eliminate the imaging surface altogether. We can usually 
imagine the projection situation to be as follows: we have a collection of non-intersecting rays running from 
IR to the imaging surface. The point at which a ray hits the imaging surface defines the image projection 
of all the points on the ray. (See Figure 2.8) The projection can be thought of as a projection from IR 3 to 
the set of rays; the imaging surface just provides us with a convenient way to coordinatize the set of rays, 
since each point on the imaging surface corresponds to a unique projection ray. This perspective suggests 
that the shape of the imaging surface is of secondary importance in understanding the image projection: 
given the projection of IR onto the projection rays, or fibres, we can generate the values of any image 
projection formed by intersecting the bundle of rays with an arbitrary surface. Note that this is not true 
of brightness values at each point on the image: the brightness of a point on the image does depend on the 
orientation of the imaging surface. 
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A two dimensional surface S in IR can be thought of as a two-dimensional 
manifold N embedded in IR by a map i : N — ► IR . 15 In the case of a non- moving 
scene or a still photograph, we can consider i to be the inclusion map, so that N — S, 
and i(p) = p. However, in the case of a moving scene, we may want to model the 
motion as a time varying embedding of some "fixed" surface N, so that we have 
i : N x IR — > IR , with it(N) = St giving the position of the embedded surface in 
space at time t. For various purposes, we might try to specify what i can be: e.g. an 
isometry, a rigid rotation, or a general embedding. We can speak about the velocity 
field in space of each point p € N as (d/dt)it(p). 

As a two-dimensional manifold on its own, N has an intrinsic tangent bundle TN. 
This is related to the usual tangent planes in IR of the embedded surface S = i(N) 
by the embedding map: Di : TN — + 71R 3 takes the two dimensional tangent space 
T p N to the two-dimensional tangent space Tif p \i(N) = T^S. One way to see this is 
to think of paths on the surfaces: a path a : IR — ► N becomes a path i oa : IR — > IR 3 
in space on the embedded surface S. Di then carries the tangent vector a'(t) to the 
appropriate tangent vector on the surface embedded in IR 3 . 

When we view a surface in IR , we can consider how the image projection, K 1 , 
maps the surface to the image. If we consider the two-dimensional surface iV as 
embedded in IR 3 by i : N — ► IR 3 , with i(N) = S, then we can consider the map 
7r o i. This is the map that takes points on the surface, N, to points on the image, 
7. 16 



An immersion is often defined as a map from a lower dimensional space to an equal or higher dimen- 
sional space with the derivative having maximal rank everywhere; an embedding is an immersion with the 
additional property that distinct pieces of the immersed surface are always well separated. Technical details 
can be found in (Abraham, Marsden, and Ratiu, 1983). 

16 Another notation for this is i** 1 = ir 1 o i which is called the "pull-back" of jt 7 by i because i* pulls 
the projection map jr : IR. — ► I back to the surface N. 
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If we allow time- varying surface embeddings, the image of a point on the imaging 
surface (assuming a time invariant projection system) is ir 1 oii{p), and so the motion 
field of the image is given by 



d_ 
dt 



(tt 7 o i t (p)) = Dr 1 o I jjit(p)) ; 



thus, Dtt 1 plays the crucial role of mapping velocity vectors in space to motion field 
vectors in the image. 

Our interest here will mostly be with still images, so that in general we can assume 
that the embedding i is the inclusion map: N is a surface in IR , and i(p) = p, so 
N = S and we can interchangeably refer to i(S) and 5". 

2.2.1 The Bounding Contour 

We can use the general image projection map to define the bounding contour of a 
surface and image. We have a smooth map k o i : N —* I from the two-dimensional 
surface to the two-dimensional image. The image consists of an interior where two- 
dimensional patches of the surface are connected to two-dimensional patches of the 
image, and a bounding contour where the relationship between the surface and the 
image must be more complicated. It seems reasonable that most of the interior of the 
image corresponds to a region of the surface where the map it 1 oi is a diffeomorphism; 
at pieces of the bounding contour (which may include curves in the interior of the 
image), this is not the case. 

To see where w o i is a diffeomorphism, we assume i and ■k 1 are smooth (as we 
shall do for the duration). Using the inverse function theorem, 17 we know that tt 1 oi 
is a diffeomorphism on a neighborhood of p 6 N if D^-k 1 o i) is invertible at p. 



As with many results in modern differential geometry, the inverse function theorem for manifolds is 
analagous to the inverse function theorem for real vector functions, and can be proved by coordinatizing the 
manifolds and applying the real result. Knowing theorems in the real case usually gives the right intuition 
for the manifold case. 
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How likely is it that D^ir 1 o i) is invertible? Sard's theorem says that almost 
all points of ^{S) will be regular values for it 1 o i, meaning D^ 1 o i) will have full 
rank and so will be invertible. What can happen to make D(ir o i) not invertible? 
We can use the chain rule to take this derivative apart: D(ir o i) = Dt o Di. By 
assumption, i is an embedding so that Di always has maximal rank, i.e. rank two. 
Similarly, by assumption we can insist that the image projection ir have full rank at 
all points of its domain in 1R , again rank two. The only way that the composition 
of the two linear maps Dir and Di can have rank less than two is if the range of 
Di intersects non-trivially the null space 18 of Die ; i.e. if there is a non-zero vector 
u (E T P N such that for v = Di p (u) we have Dk 1 i( p ){v) = 0. 19 

We can interpret this in another way using projection fibres (Figure 2.10). A 
projection fibre is a set of points in space all of which project to the same image 
point. If we consider a curve a : I — > IR running along a projection fibre, so that 
7r^(a(i)) = 7r^(a(0)), we have 

= — (tt J o a)(t) = Dir 1 o a'(t). 
at 

Since Dir 1 has rank two at each point of IR , we can consider a unit vector k(a(t)) 
parallel to a'(t) as determining the one-dimensional null space for Dir at each point 
p = a(t). We can do this smoothly over the whole projection domain using a flow 
down all the fibres simultaneously. At each point p 6 IR 3 , k(p) essentially represents 
the local projection direction; in the usual case of projection rays, k(p) will actually 
point towards (or away from) the true image point. If v = Di p (n) is in the null 
space of Dn 1 , as required for tt 1 o i to not be a diffeomorphism, then we must have 
v parallel to k(i(p)). Put another way, in this case the local projection direction is 



18 



The null space of a linear map A consists of all vectors v such that Av = 0. 



One of the reasons for having tangent spaces around is to make liberal use of linear algebra results 
on the tangent space at each point of the manifold. The tangent bundle acts like a theoretical Connection 
Machine for linear algebra. 
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Figure 2.10. The bounding contour and projection fibres. The projection fibres are tangent to the surface 
at the bounding contour. 

parallel to the tangent plane of the surface in IR 3 ; this is what is usually meant 
by the bounding contour, also called the self-occlusion locus, or extremal contour. 
At such points, the projection direction just grazes the surface in space (see Figure 
2.10). 

We now have invariant characterizations of the surface in IR 3 , the image pro- 
jection from IR to the image /, and the bounding contour of the projection. In 
the next chapter we discuss the reflectance function and an invariant version of the 
image irradiance equation. 
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In this chapter we will discuss what the image irradiance equation looks like when 
written in invariant language. Assuming fixed lighting conditions and uniformity of 
surface material, Horn (Horn, 1975) modeled the shading of an image of a surface 
as dependent on the locations and local orientations of points on the surface. This 
suggests that the surface tangent planes are important objects for understanding 
shape from shading. 

We begin the mathematical preliminaries section with a discussion of the space 
C(IR , 2) of all two-dimensional tangent planes in IR and a useful class of coordinate 
charts for it; C(IR , 2) is the space of surface orientations on which the reflectance 
map acts. We discuss how two-dimensional surfaces embedded in IR can be "lifted" 
to be two-dimensional surfaces in the space C(IR ,2). We discuss contact 1-forms, 
which are linear functional on the space C(IR , 2) useful for detecting when a two- 
dimensional surface in C(IR ,2) is lifted from IR . In Section 3.2 we discuss the 
reflectance map and the image formation process. We write the invariant image 
irradiance equation, and show that it is the same as Horn's equation given a certain 
set of coordinates. Finally, we state an invariant version of the shape from shading 
problem. In the next chapter, we tackle how this invariant description of the image 
irradiance equation can be used to generate an invariant vector field on C(IR ,2); 
this is essentially the classical characteristic strip method used by Horn. 
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Possible tangent spaces at p 

Figure 3.1. Set of all tangent planes in space, C(IR 3 , 2). At each point there is a collection of 2-dimensional 
planes, each of which could be a tangent space to a 2-d surface. 

3.1 Mathematical Preliminaries 

3.1.1 C(Ht 3 ,2) 

Consider the surface, S, embedded in IR by the embedding i : S — ► IR . As 
discussed in the last chapter, associated with each point p of S there is a tangent 
plane to S denoted T p S. This can be considered as a two-dimensional subspace of 
the three-dimensional tangent space T p JR . The orientation of the surface S at the 
point p is determined by the tangent plane. 

To discuss orientations of surfaces, it is useful to consider the set C(IR , 2) of 
all possible two-dimensional tangent planes in IR . As discussed in the last chapter, 
the tangent bundle for IR , TIR , is denned as the collection of all tangent vectors 
to IR . TIR is equivalent to the Cartesian product of IR with itself, IR x IR , 
so that a tangent vector v p G TIR is equivalent to a point (p,v) G IR x IR . 
We can think of p x IR as a vector space in its own right isomorphic to IR by 
doing vector operations only on the second component. A tangent plane at p is 
then equivalent to a two-dimensional subspace p x W of p x IR , where W is a 
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two-dimensional subspace of IR . If we define £7(2,3) as the set of two dimensional 
subspaces of IR 3 , we have C(1R 3 , 2) equivalent to the Cartesian product of IR with 
<?(2,3), C(IR 3 ,2) ~ IR 3 x 0(2,3). 0(2,3) is itself a manifold, where each subspace 
is considered a point of the set; it can be given coordinates by picking a coordinate 
chart for the Gaussian sphere of normal vectors, and then identifying two-dimensional 
subspaces of IR with their normal vectors. 1 

3.1.2 C(IR 3 ,2) and a useful coordinate system 

Since both 0(2,3) and IR 3 are manifolds, the manifold C(IR 3 , 2) ~ IR 3 x 0(2,3) 
is as well. If ( Vi, <f>\) is a chart on 0(2, 3), and (V2, <fo) is a chart on IR , then we can 
create a chart (Vi xV 2 ,<£i x </> 2 ) forC(IR 3 ,2) ~ IR 3 x 0(2,3). The usual (x,y,z,p,q) 
gradient coordinate chart based on rectilinear coordinates for IR can be thought of 
this way: it splits into the chart (x, y, z) on IR and a chart (p, q) for orientations, 
where the (p, q) chart does not depend on the point (x,y,z) in space to determine 
the coordinates for orientation. 

We can generalize the (a;, y, z,p, q) coordinates for C(IR , 2) to include nonlinear 
coordinates on IR . Consider (V, x) a coordinate chart for IR , so that V is an 

o 

open set of IR and a; is a chart map defined on V. Pick one of the coordinates, 
say x 3 , as "special." We will construct a coordinate chart for 0(2,3) at each p in 
V. We have tangent vectors ^-\ , -^\ , -^\ to IR 3 at each point p defined using 
the coordinate chart x. These span T p JR . We will define a linear map L p a which 
connects a pair of numbers, a = (01,02), to a two-dimensional subspace of T P TR. 
by mapping elements of the subspace spanned by •! ^|r| , g§r| \ to elements of a 
two-dimensional subspace uniquely determined by a. We define 



Lp : S p an j _ 



d 



;^ 2 



TpIR 3 



A manifold made up of linear subspaces of a vector space is called a Grassman manifold. (Abraham, 
Marsden, and Ratiu, 1983) 
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in coordinates by 
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where (^1,^2) are the coordinates of a vector v G Span j ^|r| , ^frl f with respect 
to the basis vectors gjr| and gp-J , and the coordinates of the range vector in T P TR 
are given with respect to the basis gfr| , gfrL> gfr| • Each choice of a = (01,02) 
gives a different linear map L p a , and the range of L p a is always two dimensional. 2 

We can consider the map L p a as defining a map C : IR 3 x ]R 2 — > C(IR 3 ,2) 
given by £ : (p,a) 1 — ► Range(i p a ). If W p is a two-dimensional subspace of T P TR 3 
such that d/dx 3 is not a member of W p , then there is an a € IR 2 such £(p, a) = Wj,. 
3 If a and a are two vectors in IR 2 with a / a, then £(p,a) / £(p,a), since the 
matrix columns of the two L p a span different subspaces. 



As often happens, there is a trade-off between writing things invariantly and writing things in coor- 
dinates; to write this map invariantly requires the use of inner products with the basis vectors, etc, and 
would not add materially to our discussion. 

To see this, pick two vectors in Wp that span it. Using x coordinates, put these column vectors together 
into a 3 x 2 matrix, A, so that 

A = 



-&■ 



where C is 2 x 2 and d is 1 x 2. We have Range(A) = W p in coordinates. C must have rank 2; if not, then 
the columns of C are dependent, and there is a 2- vector v = (yi,V2) T such that 




implying that d/dx is in the range of A, contrary to our assumption. Since C has rank 2, there are 
independent 2-vectors u and v such that 




multiplying A on the right by a column vector takes linear combinations of the columns of A and hence 
of the columns of C, so we can pick u and v to give this combination of the columns of C since C is 
invertible. The vectors (1,0, ai) and (0, 1, a2) T span the same subspace as the columns of A do, namely 
W p . Defining a = (ai, 02), we have 

C(p,a) = Range(A) = W p . 
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Figure 3.2. Local view of generalized (as, y, z, p, q) coordinates: planes at (0, 0, —1) perpendicular to local 
d/dz direction. The local (p, q) coordinates for the direction of a vector v at p is given by the coordinates 
of the intersection of the line parallel to v through p with the appropriate plane. 

Given the base chart (V,x) we can define a local chart (U, (x, a)) on C(JR , 2), 
where U is the open set of subspaces W p in C(IR 3 ,2) such that d/dx 3 £ W p and 
p e V. We define the chart (x,a) : U C C(IR 3 ,2) — ► IR 3 x Et 2 as the map 
(x,a):W p t — ►£- 1 (W p ). 4 

If the base chart x is the standard orthonormal one, this is essentially the usual 
(x, y, z, p, q) chart of gradient space, where the special coordinate x 3 is z, and (p, q) = 
(01,02). Typically, we would write a surface S in coordinates as (x,y, z(x,y)). By 
looking at velocities of the paths (t,y,z(t,y) and (x,t,z(x,t)) in S, we see that the 
two-dimensional tangent subspace to S would be the subspace spanned by the vectors 
(l,0,z x ) T = (1,0, p) T and (0, \,z y ) T = (0,1, q) T . This subspace is (in coordinates) 
the range of the matrix 



" 1 


" 





1 


_ z x 


Zy _ 



exactly in the form of the matrix for L p a ; taking p = z x , and q = z y makes 

£((x,y,z),(p,<i)) = t p s. 



To avoid a Greek explosion, we use the standard (admittedly confusing) convention of giving the chart 
the same name as the coordinates we are going to use, e.g. x : M —* JR m is a coordinate chart whose values 
are x(p) = x = (x\, X2,xz). This works because a chart <j> : M — * IR m does divide into m real functions 
4>i : M — ► IR such that <j>(p) — (<j>i(p), .. . ,4>m{p))- This useful confusion is also the source of the proper 
definition of Ax, dy, dz, etc.: dz is the derivative map of the coordinate function z : M —* IR, and so is a 
differential one-form on the manifold. 
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A graphic picture of the standard orthonormal (x,y,z,p,q) coordinate system 
comes from looking at normal directions to the tangent planes of S at (x, y, z(x, y)) € 
S. A normal vector to the surface is parallel to the vector (1,0, z x ) x (0, l,z y ) = 
(—z x , —z y , 1) T . The intersection of this direction with the plane 

P = {v|v(0,0,-1) T =1} 

(perpendicular to (0,0, 1) and located a distance 1 down the 2-axis) is (z x ,z y , — 1) , 
so the coordinates (p, q) are the first two coordinates of this point on the plane (Figure 
3.2). 

What we have done is to localize this construction: using the local vector space 
coordinates defined by {d/dx,d/dy,d/dz}, at each point (x,y,z) 6 E we can 
construct a plane perpendicular to the djdz direction at the point (0,0, —1) (where 
the origin is now at (x, y, z)). A two-dimensional tangent subspace at p has a normal 
direction through p which intersects this plane with (local) coordinates (p, q, — 1). 

In the Appendix to this chapter, we show that two charts (V, x) and ( V", x) on IR 3 
generate two charts (U,(x,a)) and (U,(x,a)) on C(IR 3 ,2) which smoothly overlap 
as required to form an atlas for C(IR 3 ,2). 

3.1.3 Lifting of surfaces 

We can now look at how surfaces in IR 3 become surfaces in C(Ht 3 ,2). We have 
a natural map U c : C(Et 3 , 2) — ► IR 3 defined as U C (W P ) = p. If i : S — ► IR 3 is an 
embedded two-dimensional surface in IR 3 , we can "lift" i to be a related embedded 
2-surface i : S — ► C(IR 3 , 2) in C(1R 3 , 2) so that n c o i = i. 

We define the lifted map \:S — * C(IR 3 , 2) by 

i(p) = T i(p) i(S) ~ T p S. 

We will not distinguish between T^i(S) and TpS' unless it is necessary. Essentially 
this new surface in C(IR ,2) includes explicitly the information about the surface 
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orientation: if p € i(S) is a point on the original embedded surface, then the related 
point on the lifted surface sitting in C(M 3 , 2) is defined to be i(p) = T P S, the tangent 
plane to S at p. 

We can look at the coordinate view of this map in the (x,a) chart discussed in 
the previous section. If we take s = (s 1 , s 2 ) as local coordinates 5 on S and x as local 
coordinates on IR , we have in local coordinates i(s) = (x 1 (s),x 2 (s),x 3 (s)). 6 Since i 
is an embedding, we know that the rank of Di is two; without loss of generality (we 
may have to relabel the x coordinates), let us assume that Di has full rank on the first 
two coordinates of x. By the inverse function theorem, we know that i : (s 1 ^ 2 ) i — ► 
(x l (s),x 2 (s)) is an isomorphism, and we can effectively change local coordinates on 
S so that our embedding locally looks like i : (a: 1 ,^ 2 ) i — ► (x 1 ,x 2 ,x 3 (x 1 ,x 2 )). 

Using this coordinate system for S and the coordinate system x for IR 3 , we know 
that 

1 

Di = | 1 

dx 3 /dx 1 dx 3 /dx 2 



in coordinates, so 



1 

Range(Di) = Range | 1 J = T P S 

dx z /dx l dx 3 /8x 2 



Meaning we have a chart s : S — ► JR. given by s(p) = (s ,s ) 

The charts have become almost invisible here. We have maps arranged as follows: 

s -L m 3 

I- 1- 

(s\s 2 )eJR. 2 — (« 1 («),x 2 (a),« 3 («))€lR 3 

Essentially, we have defined (confusingly, but in keeping with common practice) functions x* of the coordi- 
nates (s ,s ) as the coordinates in IR 3 of the embedded surface S. The rationale for this naming abuse is 
that the function x'(s) is really defined as x % (i(s)), where in the latter expression x l is a coordinate function 
on IR 3 . 
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and so from the previous section we see that in the coordinate system (x,a) for 

C(IR 3 ,2) we have 

(1 2 3/1 2 \ \ 

\ x ,x ' x {x ,x h d^'d^J 

as the coordinates for T P S. Note that x does not have to be an orthogonal set of 
coordinates for this to be true. If we write (a? 1 , a: 2 , x 3 , 01,02) = (x,y,z,p,q) as is 
conventional, then the coordinates for the tangent plane to the surface, T p S, consid- 
ered as a point in C(Ht 3 , 2) are just (x, y, z,z x ,z y ). 

3.1.4 Contact 1 -forms 

We would like to be able to detect when a two dimensional manifold S in C(1R 3 , 2) 
is really a lifted two-dimensional surface S from IR 3 . Since C(1R 3 , 2) ~ IR 3 x (?(2, 3), 
we can project any two-dimensional subspace W p C T p IR 3 in C(IR 3 , 2) onto its base 
point p just by taking the first part of the Cartesian product: we have as before 7 

II C :C(IR 3 ,2) — >IR 3 

n c : ( P , w) e m 3 x g(2, 3) .-^ P . 

Clearly we must have T1 C (S) as a two-dimensional surface in IR 3 in order for S to 
be a lifted surface. 

There is another constraint as well: consider the lifted surface given by i : 
S — ► C(IR , 2) where we have assumed the original embedded surface is i : S — > 
IR . Let a : IR — >■ S be a path on the original surface; the lifted path will be 
i o a : ]R -+ C(IR 3 ,2). Taking a coordinate system (x,y,z) for IR 3 and lifting 
it to a coordinate system (x,y,z,p,q) for C(IR 3 ,2), we can consider the surface 
to be given locally as i(r,s) = (r,s,f(r,s)) and the lifted surface to be i(r,s) = 
( r ,s, f(r, s), f r (r,s), f s (r,s)). The curve a(t) is (r(t),s(t)) in coordinates, so using 



7 C(IR 3 , 2) is a so-called trivial bundle over IR. 3 . 
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the chart (x,y,z,p,q) we can write z(ioa(t)) = f(r(t),s(t)). Taking the time deriva- 
tive, using the chain rule, and using the chart (x,y, z,p,q), we can write 

— z{\ o a(t)) = fr-£x(i o a(t)) + /«^y(i o <*(*)) 

= P(i ° "(0)^0 ° «(*)) + 9(i ° a ('))^( i ° a (0)- 
Using differentials 8 of the coordinate functions (x, y, z) on IR 3 we can write this as 

dz o (i o a)'(t) - p(i o a)[dx o (i o a)'(t))] + q(i o a)[dy o (i o a)'(t))]. 

Since we can generate any vector v € T P S by picking an appropriate path a such 
that a'(t) — v, we must have the 1-form condition 9 

(dz — pdx — qdy) o (Di o v) = 

for all v 6 T P S. We can define a contact 1-form in the coordinate system 
(x,y,z,p,q) (i.e. on the open set V C C(1R 3 ,2) on which the chart is defined) 
as 

e (x, y ,z,p,q) = (d* - Pdx - qdy)\ {XjyjZtPt& 

Thus, if i is the lift of i, then for all u € T^i(S) we have 0(u) = 0. 

The contact 1-forms are defined within particular coordinate systems, and as 
such are not defined on the whole space C(1R 3 ,2). However, we can construct the 
differential ideal, /, generated by a set of contact forms defined for a set of charts 
covering the manifold. A differential ideal, /, of differential forms is a vector subspace 



A differential is the derivative of a map from a manifold to the real numbers, in this case the coordinate 
functions. 

For the moment, we are mostly interested in 1-forms, which are just linear maps from T P S to IR: 
derivatives of scalar maps from a manifold are 1-forms, for example. If / is a real function on the manifold 
M and 6 is a 1-form, we define fd(v p ) = f(p)6(v p ) so that f0 is also a 1-form on M. The differentials of 
the coordinate functions form a basis for the linear space of 1-forms at a point p. A differential k-form in 
general is the generalization of a linear idea: at each point p £ M we stick a multilinear (with k arguments) 
alternating map acting on the vector space T P M; if the choice is made smoothly, the result is a differential 
k-form. 
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of differential forms such that for any form £ I and any other differential form p not 
necessarily in I, p A is also in 7. 10 We use a partition of unity to force the contact 
forms to be defined over the whole manifold, and the partition of unity assures that 
at any point on the manifold C(IR 3 , 2), some of the contact forms will be non-zero. 
(See Appendix Section A3.2 at the end of this chapter.) It turns out that any contact 
1-form defined by any coordinate chart (a:, y, z,p, q) on C(IR 3 , 2) constructed from a 
chart (x,y, z) on 1R 3 is a member of this ideal, so that the ideal itself is a coordinate 
independent object. 

If i : S — ► C(1R 3 ,2) is a two-dimensional embedding of S in C(IR 3 ,2) with 
Rank(n c o i) = 2, then it turns out a contact 1-form restricted to \{S) is if and 
only if i is a lifted embedding. The condition that the rank of U c o i be two is quite 
reasonable: if i is the lift of a two-dimensional surface in IR 3 , then the associated 
embedding into IR 3 , i, must have rank two, and from the definition of a lift we must 
have i = H c o i. 

We have already shown in coordinates that restricted to i(S) is if i is a lifted 
embedding. We can show this in a slightly different way by using the notion of the 
pull-back of a differential form: if is a differential k-form on M (so that P is a 
multi-linear map from T P M to the real numbers), and we have i:iV->Masa map 
between manifolds N and M, then we can define a differential form i*0 on N as 

O*0)p(vi, . . . , v fc ) = 0(i( p ))(Di o vi, . . . , D\ o V*) 

where the k-form i*0 acts on vectors vi,...,v k G T P N. If i : S — ► C(IR 3 ,2) : 
p i — ► T p S is a lifted embedding of a two-dimensional surface S and € I is a 
contact 1-form, we can compute i*0 in , s defining coordinate system. We will use 
the coordinate system (U,((x\x 2 ,x 3 ), (a 1 , a 2 ))) = (U,(x,a)) where d/dx 3 £ T p S 

The wedge product is an operation on differential forms directly connected to the alternating multipli- 
cation of multilinear alternating maps on a vector space. It is something like a generalized cross product. 
For details and definitions, see (Abraham, Marsden, and Ratiu, 1983). 
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for all p in the neighborhood U: using an argument similar to one used in the 
definition of coordinate charts for C(JR ,2), we can take coordinates for S so that 
i : S — ► C(IR 3 ,2) is given by i : (x\x 2 ) i— ► (x 1 ,x 2 ,x 3 (x 1 ,x 2 ),dx 3 /dx\dx 3 /dx 2 ) 
in coordinates. We have 

i*0 = i*(da; 3 - a^dx 1 - a 2 Ax 2 ) 

— &{x 3 o i) — [a\ o i)d(a; 1 o i) — (a 2 o i)d(x 2 o i) 

= d(xV ,x 2 ))-|^dx'-gd^ 

= ^1 + «fl^ _ Sfid*. _ ^ 

aa; 1 c/x J da; 1 dx l 

= 0, 

so on a lifted embedding a contact 1-form is pulled back to 0, i.e. the contact form 
restricted to a lifted embedded surface is 0. 11 

On the other hand consider if i : S — ► C(IR 3 , 2) is a two-dimensional embedding 
with i = H c o i also of rank two and i*0 = for some contact 1-form 8. Because 
U c o i is of rank two, on #'s defining coordinate system we can put local coordinates 
on S so that i : (x l ,x 2 ) 1 — ► (x 1 , x 2 , x 3 (x 1 , x 2 ) , a\(x l , x 2 ) , a 2 (x l , x 2 )), and 

(i*0), , =0 
i* (da: 3 - aida: 1 - a 2 dx 2 ) = 

1 1 ^^ 1 2 1 1 1 1 r\ 

^— j-dx + — ^dx — aidar — c^da; = 0. 

Equating coefficients of the cotangent basis {da; 1 , da; 2 , da; 3 } we must have 

da; 3 J dx 3 

ai = _ and a2 = _ 

indicating that i is the lift of i — U c o i. 



For details on rules for calculating with differential forms, see (Abraham, Marsden, and Ratiu, 1983). 
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3.2 The Reflectance Function and Image Forma- 
tion 

Horn (Horn, 1975) makes use of the reflectance function to summarize the influ- 
ences of light source properties, surface reflection properties, and image formation 
physics on the formation of an image from a surface. At each point in space, an 
orientation of the surface tangent plane will give rise to a particular brightness in 
the image defined by the reflectance function R. The reflectance map as described 
by Horn (Horn, 1975) can be considered to act on C(1R 3 ,2): a surface patch with 
tangent plane W at point p in space will generate a brightness in the image given by 
R{P, w )i i- e - R '■ C(JR 3 , 2) — ► Ht. We will often assume that the reflectance function 
does not depend on the location in space: essentially, this removes the dependence of 
image brightness on viewpoint, and is accurate in the limit of long lighting distances 
and long viewing distances. Full generality of the map R includes the projection 
of slides onto surfaces: we do not expect that a visual system is able to deal cor- 
rectly with a completely arbitrary R. Understanding restrictions on comprehensible 
reflectance functions for human shape from shading is an open area. 

An image is a combination of the reflectance function and the image projection. 
From the discussion in the last chapter about the image projection map, we have a 
map k : JR. —* I which generates a map ir 1 o i : S — ► I from points on the surface 
to image points. The brightness at a point 7r 7 (i(/>)) in the image is given by R(i(p)), 
where i : S — ► C(JR , 2) is the lifted embedding of the two-dimensional surface S 
in the space of possible tangent spaces C(IR 3 ,2). If we define E : I — ► IR as the 
brightness of the image at a point in the image, then we have the image irradiance 
equation: 

Eowoi — Roi. 
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Figure 3.3. Curves on the surface through the bounding contour: a. Usual case. b. Curve parallel to 
projection direction at the bounding contour. 

If we can use local coordinates such that 

i : (x,y) G S i — ► (x,y,z(x,y)) 

i : (x, y) € S i — ► (x,y,z(x,y),z x (x,y),z y (x,y)) € C(IR 3 ,2) 
7r J :(x,y,z)em 3 *-+(x,y) £ I, 
R : (x,y,z,p,q) € C(IR 3 ,2) .— ► R(x,y,z,p,q) € Et, 
and 

E:(x,y)el\ — >E(x,y) e R, 

then we can write the image irradiance equation in coordinate form as 

E(x,y) = R(x,y,z,p,q), 
where p = z x and q = z y . This is the image irradiance equation of Horn. 

3.2.1 Cut vs. Rolled Edges 

We can use the reflectance function and the image irradiance equation to under- 
stand the difference between image brightnesses at a cut edge and a rolled edge in an 
image. As we discussed in Section 2.2 the bounding contour of an image consists of 
those points that are the images of points on the surface where the projection rays 
just graze the surface. As discussed in Section 2.2, if i : 5" — >• IR 3 describes how the 
surface is embedded in space, and it 1 : IR 3 — *■ / gives the image projection map, 
the bounding contour consists of points where Dw 1 o Di has rank one, even though 
both D-k 1 and Di have rank two. What happens to the image constant brightness 
contours and the values of image brightness at such points in the image? 
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\ \ 




/ / 

Constant brightness contours 

Figure 3.4. a. Rolled constant brightness contours: they are almost always tangent to the bounding 
contour, b. Cut constant brightness contours: they are almost always not tangent to the bounding contour. 

We can consider the fate of the projection of an arbitrary smooth curve ft : IR — ► 
S onto the image if ft(0) occurs on a bounding contour element. The projected 
curve is a(t) = r oio ft(t), so the tangent vector to the curve in the image is 
a'(t) — Dfa 1 o i) o ft'(t). At a bounding contour element, D^-k 1 o i) has rank one; 
this means the range of D(tt o i) falls in a one dimensional subspace at the image 
bounding contour point — this is the direction tangent to the bounding contour in 
the image. Thus, a'(0) is parallel to the bounding contour, and so the projection 
of ft is tangent to the bounding contour (Figure 3.3a). This hides one potential 
complexity: consider a path ft : IR — ► S which itself is parallel to the projection 
direction at the bounding contour at t = 0: we have D^ 1 o i) o ft'(0) = 0. In this 
special case, the image of the path ft(t) can approach the bounding contour at a non- 
zero angle because the image curve a(t) has a critical point at t = 0: the tangent to 
the geometrical path drawn out in the image is not constrained (Figure 3.3b). 

Let us take (5 : IR — ► 5" as a curve on the surface S which yields a constant 
brightness: this means R(/3) = c, where J3 : IR — ► C(1R , 2) is the corresponding curve 
/? = i o on the lifted surface i : 5* — ► C(IR , 2). The image contour corresponding 
to ft will he a = tt o i o ft; this is now a constant brightness contour in the image. 

The constant brightness contours on the surface, determined by the reflectance 
map and orientation of the surface's tangent planes, intersect the bounding contour 
on the surface at varying angles. As a result, almost all the image projections 
of the constant brightness contours will be parallel to the bounding contour if the 
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discontinuity in the image is indeed due to the surface rolling away from the observer 
(Figure 3.4a). If the image discontinuity is due to the surface being cut along some 
curve, the projections of the constant brightness contours will almost all hit this 
curve at an angle (Figure 3.4b). This provides a clue about whether a boundary of 
image brightnesses is due to a bounding contour or to a discontinuity in the surface. 
The image brightnesses themselves behave badly near the bounding contour. 
Although the image brightness at the bounding contour in the image is defined by 
the reflectance function and the tangent plane orientation at the corresponding point 
on the surface, the derivatives of the image brightness explode as we approach the 
bounding contour. The image irradiance equation can be written as 

Eo(tt C oi) = Roi, 

where we define 7r — tt 1 o Tl c , where U c takes a subspace W p € C(JR , 2) and maps 
it to the base point p, and tt 1 is the usual image projection from IR to the image. 
Taking the derivative of this we have 

DE o D{ir c oi) = DRoDi 

DE = DRoD\o (D(tt C o i))" 1 , 

using the inverse function theorem and the chain rule at image points away from 
the bounding contour so that D(ir c o i) is invertible. As we approach the bounding 
contour, D(ir o i) has an inverse that becomes larger and larger. 12 As a result, the 
magnitude of DE will also almost always increase without limit as the bounding 
contour is approached, even though the value of E is bounded. This is very like the 
behavior of ^Jx as x approaches 0, and in Chapter 6 we shall make this similarity 
quite explicit for generic surfaces. 



If A is an invertible matrix, we have det(.A ) = l/det(j4). As det(A) approaches zero the magnitude 
of A~ must explode, since the determinant function is a smooth function on the space of matrices. 



59 



Chapter 3 The Image Irradiance Equation 

3.2.2 The Invariant Image Irradiance Problem 

We can state the image irradiance problem in a form that makes an invariant 
approach to characteristic strips possible. We can define the image projection map 
as a map on C(IR 3 , 2): as before, we define w c : C(IR 3 , 2) — ► J by ir c = tt 7 o Up as 
the map W p i — ► tt (/>), where II is the standard projection of C(TR , 2) to the base 
space IR and w is the image projection. Since II o i = 7r o i, where i is the lifted 
version of the embedding i, we can write the generalized image irradiance equation 
above as 

E o tt C o i = R o i. 
Using the notation for the pullback of a function by a map, we can write this as 

i*(EOTT C ) = i*R, 



where for a real-valued function / : M — ► R, and map <f> : N — ► M, the pullback, 
<j>*f, of the map / is defined as the map (f>*f = fo<f>:N — ► R. Note that if <f> is an 
embedding of TV in M, the effect of pulling back the function / defined on all of M 
is to restrict it to the embedded N. 
By subtracting, we can write 

f(E ott c -R) = 0. 
We define the image dynamical system function 

H:C(JR\2) — >IR 

by 

H = E o tt c - R. 
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The shape from shading problem can be stated in an invariant form as follows: 
given an image and the reflectance function, we are interested in all possible lifted 
embeddings of two-dimensional surfaces, i : S — ► C(IR 3 , 2), such that 

\*H = 0, 

since these will correspond to two-dimensional surfaces embedded in IR 3 which satisfy 
Horn's image irradiance equation. 

As described in section 3.1.4, this means we are interested in all two-dimensional 
submanifolds of C(IR 3 ,2) that restrict to two-dimensional surfaces in IR 3 and that 
satisfy both i*H = Hoi — and i*0 = 0, where is a contact 1-form. This is now in 
a form that can be converted using differential forms to a vector field on C(IR 3 ,2): 
solution surfaces are drawn out by curves in C(Ht 3 ,2) which are the characteristic 
strips. We pursue this in the next chapter. 
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A3.1 Smooth overlapping of generalized (x, y ,z, P , q ) 
charts on c(]R 3 ,2) 

To see that two local diffeomorphisms (U, (a:, a)) and (U, (x, a)) really are charts 
for C(IR ,2), we need to see that their overlap is smooth. We want to know that for 
points (x 1 ,x 2 ,x 3 ,ai,a2) G (x,a)(U C\ U) the map (x,a)(x,a)~ 1 is a smooth map. 

To make this calculation, we will look at the coordinates a in a different way by 
using the space of linear functionals on tangent vectors, also called the cotangent 
vector bundle, T*IR 3 . Consider the map a : IR 2 — ► T*IR 3 :a\ — ► a a given by 

a a — aida; 1 + a 2 dx 2 - dx 3 . 

a defines a 1-form on our coordinate neighborhood. With this definition, and defining 
Null(a a ) as the null space of a, we have 



Null(a a ) = Range 



(J!). 



since the null space of a is two dimensional and the 1-form a annihilates the columns 
of the matrix. If we use a different chart, (U, (x, a)) for C(IR 3 , 2), we have a similarly 
defined 1-form a a for each a. If (x,a) and (x,a) refer to the same subspace W p € 
C(JR ,2), then we must have 

W p = Null(a a ) p = Null(a s V 
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But with 1 -forms this occurs only if the forms are linearly related on each T p JR ; 
thus we must have 

a a = a\dx + a 2 dx — dx = \(p,a)(aidx + a 2 da: — dx ) 

= A(/>,a)a a 

at p, where X(p, a) is a real number, and hence A is a function on C(1R , 2). 

We can now express a a in (x, a) coordinates so that we can compare the co- 
efficients of the cotangent basis {die 1 , die 2 , da: 3 }. One way to do this is to use the 
algebraic rules for expanding the differential form a"; another way is to observe that 
using the x coordinate bases for T p IR 3 , {d/dx 1 \ p ,d/dx 2 \ p ,d/dx 3 \ p }, we have 

(Vi 
v 2 

in coordinates. If we change the basis for T p IR 3 to {d/dx 1 \ p ,d/dx 2 \ p ,d/dx 3 \ p } we 
have 

a a (v) = (a 1 ,a 2 ,-l)X v 2 ), 

\h 

where X = [dx/dx] is the change of basis matrix for the tangent space. Let 

(b,c,d) = (ai,a 2 ,-l)X, 

then we must have 

a a = bdx 1 + cdx 2 + ddx 3 = \(p, a)(aidi 1 + a 2 dx 2 - dx 3 ), 

and, comparing coefficients of dic^djr 2 , and dx 3 we must have X(p, a) = — d ^ 0, 
a\ — —b/d, a,2 = —c/d. Since the operations used to get a\ and 5 2 in terms of a: and 
a are locally smooth in nature, and the transition function x \- > x is smooth, the 
whole transition function (x,a) i-» (x, a) is locally smooth as well. 
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A3. 2 Contact 1-forms 

The definition of a contact 1-form in a lifted coordinate neighborhood ([/, (x, a)) 
of C(IR ,2) is closely connected to the coordinate work in Section A3.1 above. We 
define a contact 1-form on this neighborhood as 

Q(x, a ) = aids 1 + e^dx 2 — dx 3 
in coordinates. 9 is related to a a defined in Appendix Section A3.1: 



o ix ,a) = (n c )*(* a ) = « a oDn c , 



where Tl c : C(IR 3 , 2) — > IR 3 : W p i — ► p is the standard projection of C(IR 3 , 2), and 
a a is the 1-form defined earlier on IR 3 . This is clearly dependent on the local chart 
used; note also that is a 1-form on C(IR 3 , 2), not on IR 3 — the coefficients of dai 
and da2 happen to be 0. 

At a point p € Uf\U C C(]R 3 , 2) with coordinates (x, a) and (x, a) in the different 
coordinate charts, we have from Appendix Section A3.1 that 

a a = \a a , 



so 



= (n c ) *(a«)) = (n c ) *(Aa a ) = X0, 

where A is a real- valued function on C(IR 3 ,2) as before. Thus if and are two 
contact 1-forms defined on overlapping regions of coordinate charts (U,(x,a)) and 
(U,(x,a)) respectively, then on the overlap, the 1-forms are linearly related in the 
sense that there is a function f :U C\U C C(IR 3 , 2) — ► IR such that 

= f0 
on the overlapping set. 
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Assume we have an atlas of charts, {(U Qy (x,a) a )} for C(IR 3 ,2) with a corre- 
sponding partition of unity p a . We can define the differential ideal I generated 
by all p a Q , where each a is the contact 1-form defined on the coordinate chart 
(U a , (x, a) a ). A differential ideal i" generated by differential forms {u>i, w 2 , . . . , u k } of 
a manifold is the graded subalgebra of differential forms 

I = \ 2_^ Pi A u>i\ pi are any differential forms on the manifold > . 

/is a graded vector space of differential forms with the additional A operation; it also 
has the property that for any differential form p on the manifold and any differential 
form 7 in J, p A 7 is also in J. This is the property that makes / an ideal in the 
algebraic sense. 1 

If is another contact 1-form defined using another coordinate chart (U, (x,a)), 
then from above we know that on every non-empty U f\U a we have 



= f a 9, 



ai 



where f a is a function defined on U D U a . Multiplying by the partition of unity we 
generate forms p a 6 and p a f a a that are defined (although often zero) on all of U. 
Summing over a we have 

Y /Pa = = y,p«uk 

a a 

on U, since Y, a Pa = 1 as {p a } is a partition of unity. Thus any contact 1-form 
defined on an open set U C C(1R 3 ,2) is in the ideal I in the sense that there exist 
functions g a : U — ► IR such that = £ a g a a on U. 



1 For detaUs, try (Edelen, 1985). 
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Chapter 4 

The Image Dynamical System 



We now make the transition from the image irradiance equation to the image 
dynamical system. The image dynamical system is essentially the characteristic 
vector field defining the characteristic strips in the classical solution of first order 
partial differential equations. If F(x, y, z, p,q) = defines a classical first order 
partial differential equation for the function z(x, y) with z x = p and z y = q, then the 
method of characteristics is to solve the system of ordinary differential equations 

X = r p 

y = F q 

z = pF p + qF q 
p = -(F x + pF z ) 
q=-(F z + qF z ). 

A solution to this system starting from some point is called a characteristic strip, 
where p(t) and q(t) give the orientation of the surface at each point (x(t),y(t),z(t)) 
of the curve. Solutions to the partial differential equation are made up of these char- 
acteristic strips; for example, if one chooses an initial strip (i.e. points on a curve 
together with surface orientations consistent with this curve) crossing the charac- 
teristic strips, the characteristics will draw out a solution surface for the partial 
differential equation beginning at the initial curve. 
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Systems of ordinary differential equations like the characteristic strip equations 
can be thought of as vector fields. Solution curves for the system of ordinary differ- 
ential equations are trajectories that are tangent to the vector field everywhere along 
their length. 

The first part of the mathematical preliminaries section in this chapter shows how 
the characteristic vector field can be described invariantly; we can then use different 
coordinate systems depending on the task we are trying to accomplish. The material 
is more technical than what has come before, relying on ideals of differential forms 
and the Frobenius theorem; it is based on (Edelen, 1985). The main reason for going 
through the derivation this way is to allow the vector field to be easily computed using 
differential forms in very unusual coordinate systems. We will need this for Chapter 
6, where we study the image information at the bounding contour by choosing a 
very special coordinate system in which to write the image irradiance equation. The 
results of Chapter 5 and Chapter 6 do not depend on understanding the derivation 
in Section 4.1.1 in detail. The important idea, that the image irradiance equation 
can be analyzed by integrating a vector field, was used by Horn (Horn, 1975) and is 
at the heart of all the work reported here. 

In the second part of the mathematical preliminaries section, we discuss some of 
the concepts behind the study of dynamical systems, essentially the study of vector 
fields and the trajectories that are integral curves for them. We define the image 
dynamical system as the dynamical system associated with the invariant character- 
istic vector field of the image irradiance equation. We also show that in the case 
of orthographic projection with space invariant reflectance function and the usual 
( x ,V,z,P,q) coordinate system on C(IR 3 ,2), the invariant image dynamical system 
is the same as Horn's characteristic strip equations. In the next chapter we will be- 
gin examining the image dynamical system with a technique from dynamical systems 
analysis, looking at the behavior near critical points. 
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4.1 Mathematical Preliminaries 



4.1.1 From First Order PDE to Vector Field 

We begin by looking at some abstract results on ideals of differential forms and 
vector fields. The approach taken here is algebraic, relying on certain operations 
defined to act on differential forms, such as the Lie derivative, the contraction (also 
called the interior product), and the wedge product. For details of definition and 
properties of these operations which give differential form analysis its peculiar effec- 
tiveness, we refer to (Abraham, Marsden, and Ratiu, 1983) and to (Edelen, 1985) 
from which much of this material was adapted. 

4.1.1.1 Differential Ideals and the Frobenius Theorem 

In the last chapter we discussed the notion of a differential ideal of differential 
forms as a vector space of differential forms which is also an ideal under the wedge 
product of differential forms. We define a characteristic vector field v for an ideal / 
on the manifold M to be a vector field such that for all u in /, the contraction i v u> 
is also in /; this is written as i v 7 C 1} 

The set of all characteristic vector fields, V, for an ideal / is a module (essentially 
a vector space) over the set of real- valued functions on the manifold: that is, if v is a 
characteristic vector field, so is /v, where / is a function: this is because i/ v = /W- 
There is an operation called the Lie bracket on pairs of vector fields which generates 
another vector field: [u,v] is defined as L u v, where L is the Lie derivative with 



The differential form i v o> is the contraction of the differential form u> by the vector field v, defined as 

i v w(v 2 , . . . , v m ) = w(v, v 2 , . . . , v m ), 

so that i v w is a differential form of one degree lower than u. By useful convention, i v / is taken to be 
where / is a 0-form, i.e. a function. The contraction operation takes a k-form and generates a (k-l)-form 
by plugging in a vector field into the first position of the k-form. 
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respect to the vector field u. 2 The Lie algebra of vector fields over the real numbers 
is defined as the vector space of vector fields together with the Lie bracket as a 
product operation on vector fields. If / is a closed ideal, meaning for all 9 G / we 
have d$ 6 I 3 (also written dl C J), then the space of characteristic vector fields V 
is a Lie subalgebra of the Lie algebra of vector fields over the real numbers. 4 To see 
this, we first use one of Cartan's "magic formulas": for any differential form u>, and 
vector field v, 

L v u> = i v du; + di v u>, 

where L v u> is the Lie derivative of u with respect to the vector field v. 5 If v G V is 
a characteristic vector field for the closed ideal J and u> is in /, this says that L v o; is 



The Lie derivative operation, L, can be invariantly denned to operate on tensors, including vector 
fields, functions, and differential forms. It represents a kind of derivative along the flow of the vector field: 
one definition is that 

Lvu- = l[(*i)* W ]| t= o, 

where Ft is the flow of the vector field v (flows of vector fields are discussed in more detail in Section 4.1.2): 
since D(F t ) maps T P M to T q M where q = F t (p), the pullback, (Ft)* takes structures on T q M and pulls 
them back to T P M for all t Thus, (F t )*w is a curve of (linear) structures on the vector space T P M for all 
time, and its time derivative will also be a linear structure on T P M. By pulling elements along the flow line 
of v back to T P M we can take a derivative without having to move to a new and more complicated tangent 
space: we take advantage of the linear structure of T P M. 

If w is a k-form, then dw is a (k+l)-form which is in some sense a derivative of u>. The operation d 
can be invariantly defined; however, it may be easiest to think of it in coordinates: if 



w= ^2 fi x ik dx l1 A...M' 



«1<- <»Je 

is the coordinatized version of w in the chart (i 1 , . . . , i"), then we define 

dco= ^2 df ilt ... iik Adx il A...Ax ik , 

where dg = 3^-dx 1 + h $£rdx n is defined as the differential of the function g. The operator d has 

various useful properties in combination with the contraction operation, the Lie derivative, and pull-backs: 
for example, d(4>*u>) = </>*(dui), so that d commutes with pullback. See (Abraham, Marsden, and Ratiu, 
1983) and (Edelen, 1985) for more details. 

This means that V as a linear subspace of the vector fields turns out to be a Lie algebra on its own if 
/ is closed: if u and v are vector fields in V, then so is the Lie product [u, v] defined as L u v. 

For this and other properties, see (Abraham, Marsden, and Ratiu, 1983). 
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also in the ideal 7: because v is a characteristic vector field, i y u> is in J; because J is 
closed, du and di v uj are also in J, and finally J is a vector space. If we have v and u 
as vector fields on the manifold, and u a differential form, we also have the identity 

Lu(iv^) = i(L u v) w + iv(L u u>); 

essentially, the Lie derivative treats i v w as a product, and i v u> is sometimes referred 
to as the interior product of v and u. Rearranging this identity, for u and v both 
characteristic vector fields of the closed ideal 7, we have 

i(L u v)<^ = i[u,v] w = L u (i v w) - i v (L u w) <E 7, 

and so [u, v] is also a characteristic vector field. 

The Frobenius theorem for vector fields asserts that a Lie subalgebra of vector 
fields is integrable; i.e., through each point p of M there are submanifolds i : N — ► M 
of the same dimension as the Lie subalgebra, V, such that the Lie subalgebra V spans 
the tangent space to i(N) at each point p ofN. In our case, V is the space spanned by 
the characteristic vector fields. Let us assume the characteristic vector fields for the 
closed ideal 7 span a space of dimension r at each point; then an integral manifold 
N of the Lie subalgebra of characteristic vectors V has dimension r as well. 

The integral submanifold i : N — ► M solves the ideal I in the sense that for all 

oj in 7, i*cj = 0. To see this, assume u> is a k-form. If Uj, U2, . . . , u^ are in T p N, we 

have 

(»*w)(ui, u 2 , . . . , u fc ) = w(Di(ui), Di(u 2 ), . . . , Di(u k )) 

= U>(vi,V 2 ,...,Vjfc), 

where v^ = Di(ui) must be characteristic vectors for 7 since they are in the tangent 
space of the integral submanifold N for V. We know that i v u> € 7 for any u> € 7, 
and since 

i vt . . . i V2 i Vl ^ = w(vi, v 2 , . . . , v fc ), 
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we know that this 0-form (or real-valued function) must be in I. The only 0-form in 
J is the function, since I is assumed to be generated by forms of degree 1 or higher; 
hence, i*u> = 0. 

If we begin with a differential ideal I that is not closed, we can augment / by 
adding in all the differential forms dl to make a new ideal I. Since I G I, solution 
surfaces for I will be solution surfaces for I. Solution surfaces for I are also solution 
surfaces for I. If i : N — > M is a solution suface for I, then we know i*0 = for 
any £ I. This means d(i*6) = i*(d0) = 0, and as a result, i*0 = for any £ I. 
Hence, N is a solution surface for I as well. 

In our case, the base manifold M is really C(IR 3 , 2), a five-dimensional manifold. 
We begin with the ideal I generated by the connection 1-form ideal I and the 1-form 
dH , where if is a real function, H : C(JR , 2) — > IR, the image dynamical system 
function. We find the closure, I, of this ideal by including all the forms d#, where 
9 6 / is a connection 1-form. A solution surface for our shape from shading problem 
will be a submanifold that solves this ideal and on which H has value 0: solving 
the ideal means finding submanifolds i : N —> C(IR 3 ,2) such that i*dH = and 
i*0 = 0, so that i* H is a constant (we are interested only in those with i*H = 0). 
The submanifolds are potentially lifted surfaces from IR 3 . Note that if just one point 
p on such a solution submanifold has H(p) = 0, then the whole surface obeys this 
constraint. 

This ideal for the shape from shading problem has a characteristic vector space of 
dimension one in general. This is a result of the constraints on a characteristic vector 
field: if v is a characteristic vector for I, then from the definition of a characteristic 
vector i v # must be a 0-form, or real- valued function, in I. The only 0-form in I is the 
zero function, so i v = for all connection 1 -forms 0. We must also have i v dH = 
for the same reason. Finally, we must have i v d# — a0 + bdH, since these are the only 
possible 1-forms in I (up to function multiplication). We can look at the tangent 
space T p C(JR , 2) at a point p and see how these three constraints on a characteristic 
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vector restrict the possible space of characteristic vectors at p to a one-dimensional 
space. We do this in the Appendix to this chapter. Since V has dimension 1, we have 
one-dimensional submanifolds i : iV" — »■ C(IR , 2) on which i*0 = and i*dH = 0. 
These are the characteristic strips of the classical method. 

4.1.1.2 Isovector fields and extension of solutions 

Another kind of vector field associated with an ideal is the set of isovector fields 
of the ideal. A vector field v is an isovector field for an ideal J if for all in J the 
differential form L v 0, the Lie derivative of with respect to v, is also in I. This is 
also written "L V I C J. 

For a closed ideal I, i.e. d/ C 7, the characteristic vector fields of I are also 
isovector fields: this was essentially shown above, since for a characteristic vector 
field v 

L v = i v d0 + di v € I 

because I is closed and i v J C / by definition. Isovector fields of an ideal provide a 
way of extending solutions to the ideal: if i : N — ► M is a solution surface (not 
parallel to any of the characteristic vectors along it) for the ideal I so that i*I = 0, 
and v is an isovector field for I, then we have a new solution for the ideal of one 
higher dimension given by 

j : N x JR — ► M 

j : (p,s) i — >F s oi(p), 

where F s : M — ► M is the flow on M generated by the vector field v (see Section 
4.1.2 for more on flows due to vector fields): 

5j-F,(p)L=o = v(p) 

Fo(p) = P- 
We show this in the Appendix to this chapter. Essentially, we use the flow lines 
generated by the isovector field to take points on a lower dimensional solution and 
extend them to a higher dimensional solution. 
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Given an isovector field for an ideal and a solution surface for the ideal, we can 
generate higher dimensional solutions for the ideal if the lower-dimensional solution 
does not contain the isovector field in its tangent space. In the case of characteristic 
vectors for the closed ideal generated by the contact 1-form 9 and the function 
H : C(JR , 2) — » IR, this is exactly the classical method of characteristics for finding 
solution surfaces of the first order partial differential equation H = given a curve of 
initial data; we use the flow due to the characteristic vector field to extend an initial 
solution curve. Using differential forms the method is extended beyond the range of 
a single coordinate chart. 6 

4.1.2 Vector Fields and Flows as Dynamical Systems 

As we saw in Chapter 2, every point p on a manifold M has a vector space 
associated with it, T p M, and these vector spaces can be bundled together into a new 
manifold called the tangent bundle, TM. 7 If we choose an element v(p) G T p M in 
each vector space for each point p on the manifold we have a vector field on the 



There is another approach to this kind of problem: instead of focusing on the space C(IR 3 , 2) of tangent 
planes, we can look at the space of orientation vectors for tangent planes. It turns out to be conveniant to 
use the dual space to the tangent bundle for this purpose, where the dual space T*IR 3 consists of all possible 
linear functional acting on the tangent spaces TpIR 3 . We can generate a Hamiltonian problem on T*JR 3 
corresponding to the characteristic problem on C(IR 3 , 2) using the symplectic 2-form u> = dx A dp + dy A dq 
on T*TR. ; the contact 1-form on C(IR 3 , 2) essentially becomes the canonical 1-form on T*IR 3 . Trajectories 
of this Hamiltonian dynamical system are again related to characteristic trajectories; note that the problem 
is now set in a six dimensional space. 

We can generalize the idea of the tangent bundle to the notion of a vector bundle: we have two 
manifolds related by a map w : E — ► M, such that w~ 1 (p) is a vector space (isomorphic to some constant 
vector space V for all the different p). We then speak about a map v : M —* E as a section of E if 
7ro v(p) = p; in other words v picks out a vector in the vector space connected to p. In our case of a tangent 
bundle, we have the map n : TM — ► M which takes any vector v € T P M C TM to its base point p £ M , 
and TpM = x~ (p), so the tangent bundle is a vector bundle, and a vector field is a section of this bundle. 
The idea of a vector bundle can be used to construct lots of linear structures and tools for working with 
tangent bundles: typically tools from linear algebra form vector spaces, so we can put a copy of each such 
vector space "above" each point on the manifold to help work on the tangent space at each point. 
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manifold. Another way to think of this is as a map v : M — > TM taking a point p 
to a vector v(p) e T P M C TM. 8 

We can pick coordinates to examine the detailed behavior of a vector field: if 
we take a chart (U, <f>) for M around a point p G M, then at p the intrinsic vector 
v(p) = [((f>, v)] has a coordinate representative v G IR n using the chart <j>. As 
discussed in Chapter 2, one way to find this coordinate representative is by using 
paths: say w(p) can be thought of as the tangent vector at time t of the path 
a : IR — ► M. The coordinate view of this path, <f> o a, will have v as its derivative at 
t = 0: using the derivative map, we have D{<j) o a) t (l) = (</> o a)'(t), so 

D((j> o a)t(l) = D<j)o Da t (l) = £><£ o v(p). 

It turns out that for a vector field X : M — ► TM we can find paths a(t) such 
that a'(t) = X(a(t)) for an entire interval of i's; in other words, at each point 
along the path, the time derivative of a is the same as the value of the vector 
field at the point on the path. In fact, we can find an entire family of such paths, 
F(x,t) : M x IR -» M such that ^F(x,t) = X(F(x,t)), and with the following 
additional properties: F(x, 0) = x, so that the path F(x, t) can be thought of as 
starting from x at t = 0, F t (F s (x)) = F t+S (x), where F t (x) = F(t,x), F is smooth, 
and under certain mild conditions, F is unique. F is called the flow of the vector 
field X, and the properties about it are proved using the fundamental existence 
theorem for ordinary differential equations on IR n : essentially, we can use charts to 
convert the problem into problems on patches of IR™ and tie these together. 9 If the 



Note that even though we are actually making statements about mappings from the manifold to a set 
of equivalence classes, we can quite happily ignore this formal "complication" and reason as if we were still 
at home in IR . Even cleaning up the details does not make much reference to the underlying equivalence 
classes; one speaks about taking coordinate charts and applying them to vectors and points. 

Again, for technical details see (Abraham, Marsden, and Ratiu, 1983). 
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Figure 4.1. The pendulum dynamical system: a. The physical system, b. The flow lines; the vector field 
is tangent to the flow line at each point. 

vector field is thought of as a field of velocity vectors for a fluid, the flow tells how 
a lightweight particle in the fluid would move over time. 

A vector field on a manifold is called a dynamical system. The flow lines Ft(x) 
considered as a function of t are the trajectories of the system. As an example 
of such a dynamical system, consider the simplified equation for a two-dimensional 
pendulum: 



ii 
d* 2 



9 = -sin0. 



If we take x = 9 and y = 6, we can write this second order equation as a pair of first 

order equations: 

x = y 

y = — sin a;. 
We can package x and y together and consider the vector field 



X(x,y) = 



y 

sin a; 



as a vector field on IR 2 . 10 A flow for this vector field will be a map F : IR 2 x IR — > IR 2 



We could be more careful and consider it as a vector field on the manifold TS 1 , the tangent bundle 
of the unit circle; the angle is a chart for the circle, and the angle and angular velocity together give a 
tangent vector to the circle, i.e. a point of TS 1 . The vector field tells the time derivative of the position 
and velocity at each point of TS 1 . 
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such that 

±Ft(p)\t=o = X(p); 

if we consider (x(t), y(t)) = F t (x(0), y(0)) to be a path along the vector field starting 
at (a:(0),?/(0)), then we get solutions to the first order differential equation (Figure 
4.1). 

Part of the task of studying a dynamical system is to connect features of the 
vector field with features of the flow. An important feature of a vector field is a 
critical point, a point pGM where the vector field is the vector of T p M . p is then 
a fixed point of the flow F, that is F(p, t) = p for all t: the flow "line" Ft(p) = p is 
consistent with the vector field, since ~ft(Ft(p)) = 0. For example, in the pendulum 
case, both the origin (0,0) and the point (tt,0) are critical points: X = at these 
points. The behavior of the flow lines around these two points is quite different: at 
the origin, the flow lines form circles around the critical point describing how the 
pendulum oscillates back and forth around the origin, with velocity about ninety 
degrees out of phase with position. At (w, 0) we have a very different picture: we 
seem to have two flow lines intersecting at the critical point. In fact there are 
four trajectories that approach infinitely close to the critical point at (7r,0). One 
pair of these are trajectories for the pendulum that become infinitely slow at the 
top of the pendulum's arc (6 = 7r), and the other pair are trajectories which begin 
infinitely close to the top of the arc with near zero speed and begin to move away with 
increasing speed. Other flow lines run near these four trajectories without touching 
the critical point. We shall have much more to say about critical points and their 
classification in the next chapter. 

One of the questions that comes up in the study of a dynamical system is that 
of stability: what features of a particular dynamical system remain even if the sys- 
tem is slightly perturbed, e.g., the vector field is changed a bit? There are several 
ways to define what is meant by a dynamical system remaining substantially "un- 
changed" : we shall consider two dynamical systems to be related if the flow lines are 
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topologically equivalent in the two systems, i.e. there is a homeomorphism from one 
dynamical system to the other which maps flow lines to flow lines. 

Part of the difficulty in answering questions about stability is in defining what 
sorts of perturbations we are interested in. For example, if we are studying a partic- 
ular class of vector fields (Hamiltonian vector fields, for example) we have to decide 
if we are interested in questions of stability in the set of all Hamiltonian vector fields, 
or stability in the set of all smooth vector fields: the difference can be important. In 
the pendulum example, if we modify the vector field by adding an arbitrary small 
smooth vector field to all the points, it turns out that the circular closed orbits around 
the origin will in all likelihood disappear and become trajectories that spiral in or 
spiral out from the critical point (friction will make the trajectories spiral inwards, 
for example); the character of the trajectories near (0, w) will be unaffected. On 
the other hand, if we restrict ourselves to perturbations that give energy conserving 
Hamiltonian dynamical systems "near" the original, it turns out the closed orbits 
near the origin are a stable feature of the vector field. 

A perturbation approach can be useful in understanding the computational feasi- 
bility of a particular dynamical system. If one actually tries to implement a dynam- 
ical system on a computer, one has to face the difficulties of discretizing a smooth 
theoretical construction. One way to model this is as a (presumably small) per- 
turbation from the original problem to the discretized one. Unless special care is 
taken, this discretization perturbation will not be very forgiving of restrictions on 
the class of dynamical systems in which a feature is stable: it may be quite dif- 
ficult to accurately study features that are unstable under relatively general small 
perturbations. 

The computational problem is one kind of perturbation stability to be concerned 
about. Another comes from the lack of exact knowledge about the dynamical system 
itself. In the image dynamical system we will construct, we assume we know the 
reflectance function. It is of interest to know how features of the dynamical system 
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change if we have made a small mistake in this regard. This is different from the 
computational perturbation in that the perturbation of the reflectance function may 
generate a quite restricted perturbation of the vector field due to the special form of 
the vector field. This sort of perturbation study is an open question for the image 
dynamical system. 

4.2 The Image Irradiance Dynamical System 

As indicated last chapter, we can summarize the image solution problem as the 
question of finding lifted solution surfaces i : S — ► C(JR 3 , 2) such that 

i*# = 0. 

We can consider this problem to be that of finding all two-dimensional submanifolds 
projecting to two-dimensional manifolds in ]R 3 and such that \* H = and i*0 = 0, 
where is any contact 1-form. 

The function H = E o x c - R defined in Chapter 3 essentially determines a first 
order partial differential equation. As described above, we can use contact 1 -forms 
and the calculus of differential forms together with the Frobenius theorem to convert 
the above problem into a vector field integration problem. The characteristic vector 
field derived in the last section is the same as the characteristic vector field of Horn, 
but is not tied to a particular coordinate representation. 

We can show that the characteristic vector field is the same as Horn's charac- 
teristic strip equations in a particular coordinate system. We will work within the 
following framework: we assume orthogonal projection and global rectilinear coordi- 
nates for IR and the image plane such that in coordinates we have 

^(x.y.z) = (x,y). 

We also assume a space invariant reflectance function so that R(x, y, z, p, q) = R(p, q). 
We want to find the components of a characteristic vector field X. We have the 
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constraint ixdO = a0 + bdH. In coordinates, we have = dz - pdx - qdy, dO = 
dx Adp + dy A dq, and dH - E x dx + E y dy - R p dp - R q dq. Using the definition of 
contraction acting on the wedge product, 11 the constraint ixd# = aO + bdH on the 
characteristic vector field X = (X x ,X y ,X 2 ,X p ,X g ) in coordinates becomes: 

X x dp — X p dx+Xydq — X q dy 

= a(dz - pdx - qdy) + b(E x dx + E y dy - R p dp - R q dq). 

Gathering together coefficients of the basis 1-forms dx, dy, dz, dp, dq, we have 

a = 

X x = —bR p 

X y = —bRg 

Xp = —bE x 

Xq = —bRy. 

We also have ixO = 0; evaluated in coordinates this contraction becomes X z — pX x — 
qX y = 0. Thus, in coordinates our vector field X looks like 

Rr> 



X = -b 



Rq 

pR p + qR q 

E x 



where b is an arbitrary constant. An integral path for such a vector field with b = -1 
would have 





x' 




R p 


d 


y 




Rq 


d7 


z 


= 


pRp + qR 




p 




E x 




■ 9- 




E v 



these are the characteristic strip equations as derived by Horn (Horn, 1975). 



It can be shown that 



iv(<? A w) = (i v #) A w + (-l) k A (i v w), 



where is a k-form: see (Abraham, Marsden, and Ratiu, 1983). 
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We can view the shape from shading problem as a dynamics problem on C(TR. , 2), 
where the dynamics are defined by the characteristic vector field. The dynamical tra- 
jectories of the system essentially draw out characteristic strips on possible solution 
surfaces. A possible solution surface is therefore an invariant surface of the char- 
acteristic flow, in the sense that points on the surface move to other points on the 
surface under the characteristic flow. 

As Horn recognized, some way to choose a subset of the characteristic trajectory 
strips is needed: they fill a volume of C(IR , 2). One way might be the specifica- 
tion of an initial non-characteristic trajectory from which to draw the rest of the 
surface using characteristic trajectories: this is the classical Cauchy problem of first 
order partial differential equations. Unfortunately, we do not usually have such an 
initial strip given. How can we reduce the ambiguity in the choice of characteristic 
trajectories that will make up a solution surface? 

One way to look at the ambiguity of solution surfaces is to consider how many 
solution surface patches are consistent with a small image patch and the given re- 
flectance function. We can examine patches without critical points or bounding con- 
tour points, patches with just critical points, or patches with just bounding contour 
points, for example. 

If the image patch does not contain critical points of the image intensity or any 
bounding contour elements, then we can take a curve (x(t),y(t)) in the image patch 
and specify depth values z(t) along it. The image irradiance equation E(x,y) = 
R(p, q) and the contact 1-form integrability condition z' = px'+qy' will almost always 
lift this curve in IR 3 to a curve in C(IR 3 ,2) by determining p and q. If this curve is 
not tangent to any of the characteristic curves (this is one reason critical points have 
to be excluded), then a solution patch will be drawn out by the characteristic curves 
from this initial curve in C(IR 3 ,2). The ambiguity of smooth solution surfaces for 
such an image patch can be summed up by the smooth specification of depth values 
along a given path in the image. 
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There may be problems with such solutions as they are extended beyond the 
bounds of a small patch. One kind of problem that may arise is the folding of 
the surface "unnaturally" in C(1R 3 ,2) in such a way that no possible space surface 
could have generated it without self-intersections or other strange behavior we do 
not expect for a surface consistent with a smooth image. This is an open area for 
exploration. 

We look at ambiguities of an image patch which contains a critical point in 
the next chapter. The use of critical points to help determine solution surfaces 
was first explored by Horn (Horn 1975) who used the critical point to generate an 
approximate initial contour on the surface from which to draw the rest of the surface 
with characteristics. Bruss (Bruss 1980) made further theoretical progress for a 
particular class of reflectance functions. 

Critical points in the image due solely to critical points in the reflectance function 
determine critical points of the characteristic vector field on C(JR 3 ,2). The key idea 
exploited in the next chapter is that a possible solution surface for a smooth region 
of an image containing a critical point should be a smooth, invariant manifold of 
the image dynamical system containing the corresponding critical point of the image 
dynamical system. As we shall see, in general this provides strong constraints on 
what the behavior of the surface can be, since most characteristic curves near a 
critical point will not actually approach the critical point. 

In Chapter 6 we look at ambiguities in solution surfaces for an image patch 
containing a piece of bounding contour. As indicated in Chapter 2, the bounding 
contour is a curve in the image for which we actually know the surface normal 
direction at each point. We will suggest that local patches of bounding contour 
image data provide more of a constraint on the reflectance function than on the local 
behavior of the surface near the bounding contour: given the correct reflectance 
function, we hypothesize (and show to third order) that the surface is determined 
only up to a choice of depth values along the bounding contour in the image, very 
similar to an image patch in the interior without critical points. 
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By examining the tack of extracting shape ialonoAtion from shading in an image 
with geometric took, one hopes to mahedtar tise^eiM^i^eas of different sosrces of 
information to potential sedations. HopefnJfoona can •— ante precisely why certain 
assumptions are required, are useful, or are riwnnshtu hf choosing an interpretation 
of an image. 
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A4.1 The dimension of the characteristic subspace 

We show here that the characteristic subspace of the closed ideal generated by 
the contact ideal and the function H : C(IR 3 , 2) — > Ht is in general one dimensional. 

One way to do this is to look at the coordinate representation of these constraints 
using matrix notation. Pick a coordinate neighborhood (U, (x,a)) for C(JR 3 , 2) used 
to define a connection 1-form 9. In this coordinate system we can write the 1-forms 
9 and dH as 

9 = da; 3 — aidx 1 — a 2 dx 2 
dH = Hidx 1 + H 2 dx 2 + H 3 dx 3 + H 4 d ai + # 5 da 2 . 

The 2-form d# is an antisymmetric 2-tensor at p, and so can be represented as an 
antisymmetric 5x5 matrix, A. In this coordinate system we have d# = da; 1 A dai + 
da; 2 A d<22, so in matrix form we can write 



where 



■ 





-1 

. -1 


1 







1 





d0(u,v) = 


- u Av 
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using the coordinate vector representations for u and v. As row vectors in this 
coordinate system we can write 

6 = [-a 1 ,-a 2t 1,0,0] 
dH = [H 1 ,H 2y H 3 ,H Al H s \. 

Two conditions for a characteristic vector v, #(v) = and di/(v) = 0, can be 
combined and rewritten in matrix form as 



Wvi 



— a\ —CL2, 10 
#1 H 2 H z H 4 # 5 



V 5 A 



= 0. 



The condition i v d# = aO + bdH can also be written in vector form in the following 
way: there exists a 2- vector u € IR 2 such that 



v T A = u T 



—a\ — <Z2, 10 
H\ H2 Hz Hi H$ 



= u T W. 



Notice that the rows of the matrix on the right are the representations of and dH. 
The assertion of the matrix statement is that the 1-form i v d# (represented by the row 
vector v T A) is a linear combination of 9 and dH (represented as row vectors) with 
coefficients given by the 2- vector u. By making an augmented row vector [v T , u T ], 
we can write this condition as 



[v T ,u T ] 



A 
-W 



= 0. 



We can add the two 1-form conditions to this expression by augmenting the 
matrix: 



[v T ,u r ] 



A W r 
-W 



= [0,0], 



If we find an augmented non-zero vector [v T , u T ] such that this condition is obeyed, 
then the v part obeys the conditions for a characteristic vector. 
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Since the augmented 7x7 matrix 



A^ 



A 


W T ' 












-W 















' 








1 





-«i 


Hi 














1 


-a 2 


H 2 

















1 


Hz 


-1 

















Hi 





-1 














E 5 


a\ 


«2 


-1 














-Hi 


-H 2 


-H 3 


-H A 


-H 5 





. 



is antisymmetric, it has maximum possible rank 6 (an antisymmetric matrix always 
has even rank); by examining the columns of the matrix, one can see that the matrix 
always has rank 6. There is always a one dimensional subspace of augmented vectors 
[v T , u T ] such that [v T , u T ]A = 0. If the v part of an augmented vector in this 
subspace is non-zero, then it is a characteristic vector, since by the above construction 
i v = € I for connection 1-forms 0, i y dE = € I, and i v d# = aO + bdE 6 I, so 

i v i c i. 

When will the null space of A have a v component that is zero? For this to 
happen we must have u T W = 0, which means the two 1-forms and dE are a 
dependent set. This could happen either because they are linear multiples of each 
other, or when dE = 0. Assuming the coordinate representation used previously for 
image projection w 1 : JR. 3 — ► I : (a; 1 ,^ 2 ,* 3 ) i — ► (a; 1 , a; 2 ), and using the definition 
of E, dE = describes a kind of picture function extremum which occurs when 

E x i = R x i 

E x 2 = R x 2 

R x 3 = R ai = R a2 = 0. 

In the space invariant reflectance function case where R x i = R x 2 = R x s = 0, this 
means that an extremum has occured in both the image brightnesses and the image 
reflectance function. The case where dE and are dependent also reduces to this 
case if the reflectance function is space-invariant. 
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For the most part, the v component of the augmented vector will be non-zero 
and therefore there will be a one-dimensional set of characteristic vectors for the 
ideal I. 

A4.2 Extension of solutions by isovector field flow 

We show here that if F s is the flow of an isovector field v of a differential ideal 
7, then the embedding j : N x IR -+ M given by j(p, s) — F s (i(p) extends a solution 
surface i : N — > M for the ideal I by one dimension. 

To see that j is a solution to the ideal, we observe that 

j*$ = i*F s *0. 
It turns out that F S *I C I : one way to see this is to observe that 

j;(F.*9)\^ = ±(F s , +t *0)\ t=o 

= ^Ft*(Fs.*e)\ t=0 
= L v (F s i $), 

since (d/dt)Ft u>\t=o = L v w, the Lie derivative of u> with respect to v, for any 
differential form u>. At s = we have 

F s #U=o = Q- 

If we consider these two as a first order differential equation on the vector space 
(here thought of as over the real numbers) of differential forms, i.e. 

j 
— p(s) = L v p(s) 

P (0) = 9, 

where L v is now a linear operator on the real vector space of differential forms, we 

have the operator series solution given as 

/ oo 
A 



p(s) = exp( 5 L v ) o0= [J2 s% 
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By uniqueness of the solution p(s) to the differential equation with initial condition, 
we must have 

F s *6 = exp(sL v ) o 6. 

We have assumed that L v / ^ I since v is an isovector field; thus, the series expression 




must also remain in I for all in I, and so F*I C I. 1 

Since i is a solution for I we have i*I = 0; since F S *I C 7, we have i*F s *I — 0, 
and hence j(s,p) = F s (i(p)) is a new solution for the ideal I. In order for j to be an 
embedding Dj must have full rank. To calculate Dj we observe that if we consider 
the flow F s of the vector field v as a map F :JRx M — > M : (s,p) i — ► F s (p), then 

DF:TRxTM — ► TM 



DF: 



[DiF,D 2 F] 



Another way to see this is to consider the space of all differential k-forms on M as an infinite 
dimensional manifold, M, with individual forms 9 being points on the manifold. It so happens that this 
space is a vector space: we can add points in the space, and multiply by scalars (real numbers here). 
The tangent space, TgM, to a point 9 in this manifold is therefore equivalent to the original space itself, 
TgM ~ M, and at each point TgM will also be infinite dimensional. A tangent vector can be considered 
as an ordered pair of k-forms, e.g. (9, p) € TgM, with 6 being the base point, and p giving the tangent 
vector representative in M. A vector field on this new manifold consists of a choice of tangent vector for 
each base point in M. We can consider L v to be a vector field: for each point 9 in M, we consider L v 
to pick out the tangent vector (0,L V 0). We can now properly consider the flow of this vector field on the 
manifold M, T„ : M — ► M where 

^r,(8)\,=o = (L v )(9) = (9,L V 9) 

r (9) = e 

defines the flow for the vector field L v on the infinite dimensional manifold A4. If the vector field L v 
restricted to a submanifold I of M lies completely in the tangent space to /, then we know that the 
induced flow of a point on the submanifold I must also lie in this submanifold. In our case, I is the set of 
k-forms in the ideal I, and is in fact a subspace of M. The vector field L v at a point 9 £ I is also in I when 
v is an isovector field, since L v € /. The flow is the flow T, , and so T, (9) £ I. By uniqueness of solution, 
we know F s (0) = F,*9, and so F s * I C I. 
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where we have split the derivative DF into two components, the first with respect 
to flow time and the second with respect to location on the manifold. From the flow 
equation we know that 

D x F {s , p) = v{F{s,p)), 

where (s,p) 6 IR x M. At (s,p) we have 

v(F„(p)) = ±(F t o F.)\ M = ^r(F s o F t )\u* = D 2 F {S!P) o v(p). 



dr 



Since j(s,p) = F o (s,i(p)), we have 



df 



Dj( s , p) 



r 
u 


= 


D\F D 2 F 




1 






Di 


r 
u 




= 


DiF D 2 F o Di 




r 
ii 






J L *■* J 


r n 


= 


v(F(s,i(p))) D 2 FoDi p 


r 
u 




'■ -, 




= 


D 2 Fov(i(p)) D 2 FoDi p 


r 
u 



and so the rank of j is full if and only if v(i(p)) and Di p span independent spaces 
for all p £ N. Note that this condition needs only to be verified along the lower 
dimensional submanifold i : N — ► Af, not along the flow lines: the flow of v is a 
diffeomorphism so that D 2 F is full rank, and this makes j an embedding throughout 
the range of definition of the flow if it is an embedding along the initial submanifold. 
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Chapter 5 

Critical Points of the Image Dy- 
namical System 



In this chapter we discuss the role of critical points in the image in determining 
solution surfaces for the shape from shading problem. The main result is that in 
general (i.e. generically, in the absence of special symmetries) near certain kinds of 
critical points in the image there are at least two and at most four possible solution 
surfaces. These critical points are due to critical points in the reflectance function, 
and are either the result of local maxima (the usual case) or minima in the reflectance 
function. More complicated behavior may occur with saddles in the reflectance 
function; more work is needed to understand this case completely. 

We use the image dynamical system developed in the last chapter and apply a 
technique from dynamical systems theory called linearization to study the behavior 
of the dynamical system near the critical points. Linearization essentially involves 
looking at the first order behavior of the characteristic equations around the critical 
point to get a linear dynamical system with behavior similar to the nonlinear system. 
A linear dynamical system X = A • x can be analyzed by looking at eigenvalues and 
eigenvectors of the matrix A, and in general invariant subspaces of linear dynamical 
systems are vector spaces spanned by sets of eigenvectors. If the eigenvalues of the 
nonlinear system do not have zero real parts, the invariant manifolds of the nonlinear 
system are topologically isomorphic to those of the linear system on a neighborhood 
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of the critical point. We analyze the image dynamical system in this way to get 
results. 

We look to see when critical points in the image are "good" critical points to 
which this analysis can be applied, and we look at the connection between the re- 
flectance function critical point type, the surface type at the critical point, and the 
type of the two-dimensional image dynamical system restricted to the correct solution 
surface. 

Two of the invariant manifolds that are possible solution surfaces are the stable 
and unstable manifold. Simply reversing time interchanges the labels, so it is of in- 
terest to have ways to find, say, unstable manifolds for the image dynamical system. 
In Section 5.3 of this chapter, we show a method based on a mathematical theorem 
called the Lambda Lemma: we take an initial surface that cuts the stable mani- 
fold, and allow it to flow forward using the image dynamical system. The Lambda 
Lemma says that as t goes to infinity, the deformed surface will C approach the 
unstable manifold (Palis and de Melo, 1982). We show some experiments using the 
Connection Machine, a highly parallel computer, on implementing this idea. The 
resulting methods seem to have good noise tolerance and robustness in the face of 
wrong information about reflectance functions. 

5.1 Mathematical Preliminaries 

Modern dynamics has emphasized the study of the local behavior of trajectories 
near critical elements of a system. There are three types of critical elements to 
consider: one type is the closed orbit with a finite period; another type is a critical 
point of the vector field, i.e. a point p where X(p) — 0; this can be considered as 
a trajectory with infinitely short period. Finally, there are other critical elements 
which can be described as chaotic. We will concern ourselves exclusively with critical 
points; existence and properties of the other critical elements for an image dynamical 
system is an open area for research. 
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Dynamicists have been very interested in the stable and unstable manifolds, S + 
and S~, associated with a critical element in a vector field as a way of classifying 
the critical point and studying its interaction with other critical points. The stable 
manifold contains those trajectories that wind towards the critical element as time 
along the trajectory proceeds in the positive direction; the unstable manifold con- 
sists of the trajectories that wind towards the critical element with decreasing time. 
"Wind toward" means that the critical elements are approached in the limit as time 
runs to infinity, positive or negative in the stable and unstable cases respectively. 
In our situation, "time" is just a parameter along the trajectories of interest, so the 
stable and unstable manifolds are quite similar in character: they are sets that are 
invariant under the flow of the vector field, they include the critical element, and 
they contain trajectories that all approach the critical element asymptotically. 

5.1.1 Critical Points and Invariant Manifolds 

Much of the methodology comes from (Abraham and Marsden, 1985). To ex- 
amine in more detail the trajectories near a critical point, we can use the linearized 
version of the vector field. If p is a critical point for the vector field X on a manifold 
M, we define the linearized vector field X' on the vector space T P M around pasa 

linear map 

X' : T p M — > T p M 

X'(v) = ±(DG x (v))\ x=0 , 

where G\ is the flow for X. This definition makes sense, as G\(p) = p since p is a 
critical point for X, so 

DG\ : T P M — > T P M, 

for all A. The curve A (-► DG\(v) for fixed v is a curve in the fixed vector space T p M 
and so we can sensibly take the derivative with respect to A and still get a value in 
the vector space T P M. 
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In a coordinate system x on M, it can be shown that the matrix of X' is simply 
given by 

[*'] = 



dX l 



dx> 

(Abraham and Marsden, 1985). We consider T P M to be an approximation to the 
manifold near p: a small vector v in T p M represents a point near p. X'(v) is the 
first order linear approximation to the vector field at this approximate location. 
This is also suggested by a Taylor series argument in 1R : we have X(x + h) = 

X(x) + DX x (h) -\ ; our X' in coordinates is essentially the derivative DX. At a 

critical point x c , X(x c ) = 0, so the derivative approximates the vector field near x c . 
What is the behavior of X' on the tangent space to an invariant manifold through 
the critical point? Let S be the invariant manifold containing the critical point p. 
We must have 

G X (S) C S 

by definition, and since G\ is a difFeomorphism, G\{S) is also two-dimensional. Thus, 

D(G X ) P (T P S) = T P S 

for all A and hence X'(T P S) C T p S. Thus, the tangent space T P S to the invariant 
manifold of the vector field X at the critical point is an invariant linear subspace of 
X . If X (T p S) is one-dimensional, then T p S contains a zero- eigenvalue eigenvector 
for X'. 

If we have a linear operator on a vector space, A : V — ► V, what are the 
invariant linear subspaces of it? Linear algebra provides some information about 
this: if A is diagonalizable, every two-dimensional invariant subspace is the span 
of two eigenvectors of A. 1 If A has eigenvalues with multiplicity 1, the number of 



1 Concepts from (Hoffman and Kunze, 1971): if W is an invariant subspace of A so that AW C W, we 
can define A\\y : W — ► W as the restriction of A to W. The minimal and characteristic polynomials for 
A\w divide the minimal and characteristic polynomials of A. The characteristic polynomial is defined as 
det(j4 — XI), while the minimal polynomial is the unique polynomial g(X) of lowest degree which annhilates 
A (i.e. g(A) = 0) with 1 as the coefficient of the highest power of A. The Cayley-Hamilton theorem says 

94 



Chapter 5 Critical Points of the Image Dynamical System 

possible W is quite constrained: the finite number of pairs of different eigenvectors. 
If A has an eigenvalue with multiplicity greater than 1 , and hence an eigenspace of 
dimension 2 or more, then any combination of an eigenvector from this eigenspace and 
another eigenvector from another eigenspace can give rise to an invariant subspace: 
the number of possible invariant subspaces is now infinite (but there is still constraint 
on the orientations.) 

Returning to the dynamical system, if X' is diagonalizable with eigenvalues of 
multiplicity 1, then there are a finite number of invariant linear subspace in T P M. 
There are only a finite number of possible solution surface tangent planes through 
the critical point that are consistent with the dynamical system. 

The local stable manifold theorem (Abraham and Marsden, 1985) indicates that 
the invariant subspace spanned by the set of eigenvectors of X' with negative eigen- 
values gives rise to a unique invariant manifold of X near the critical point, called 
the stable manifold, S + ; the set associated with positive eigenvalues gives rise to the 
unique unstable manifold, S~. Points on S + will move asymptotically close to p in 
the limit of increasing time; points on S~ will do the same in the limit of decreasing 
time. The dynamical system restricted to the stable/unstable manifold looks like a 
sink or a source near the critical point. 

5.1.2 Hyperbolic Critical Points 

To be able to easily examine other invariant manifolds of the flow near a critical 
point, we restrict our attention to a certain class of critical points. If a critical 
point has no eigenvalue with real part equal to zero, it is called a hyperbolic critical 



that the characteristic polynomial of A annihilates A, so the minimal polynomial divides the characteristic 
polynomial. The roots of the characteristic polynomial give the eigenvalues of A, and the multiplicity of 
the roots gives the necessary dimensions of the eigenspaces if A is to have a basis of characteristic vectors, 
i.e., if A is to be diagonalizable. It turns out that A is diagonalizable if and only if the minimal polynomial 
has linear factors. Assuming A is diagonalizable, then the minimal polynomial of A\w must also be the 
product of linear factors, since it divides the minimal polynomial of A. Hence, A\w is diagonalizable as 
well. The eigenvalues of A\w are a subset of the eigenvalues of A. A characteristic vector of A\w in W is 
also a characteristic vector of A in Vj so W is spanned by a pair of characteristic vectors of A. 
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Unstable 
Manifold 



Figure 5.1. Pendulum hyperbolic critical point at (w, 0). 



Stable 
Manifold 



point. Such a critical point is stable under generic perturbations of the vector field: 
that is, nearby vector fields in the space of all vector fields also have a hyperbolic 
critical point near this one, with the same number of eigenvalues with positive and 
negative real parts (Abraham and Marsden, 1985). This reflects the generic nature 
of matrices (corresponding to various possible X') with eigenvalues that are non-zero 
and not purely imaginary. In addition, the Grobman-Hartman theorem (Palis and 
de Melo, 1982) asserts that there is a neighborhood of a hyperbolic critical point in 
which the nonlinear flow lines are homeomorphic to the flow lines of the linearized 
dynamical system X' on T p M. Thus, in a neighborhood of a hyperbolic critical 
point, the invariant (linear) subspaces of the dynamical system X' generated by 
pairs of eigenvectors of X' are homeomorphic to invariant manifolds of the nonlinear 
dynamical system X on M. Not only are we assured of the existence of the stable 
and unstable manifolds in this case, but any pair of eigenvectors, e.g. one associated 
with a positive real part eigenvalue and one associated with a negative real part 
eigenvalue, generate an invariant manifold in the neighborhood of the critical point. 
In the latter case, the flow restricted to the two-dimensional invariant manifold will 
be a saddle flow. 
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J~+ 




Figure 5.2. Hyperbolic critical point of a dynamical system in IR 3 . 



A simple example of a hyperbolic critical point is given by the pendulum example 
from Chapter 4 at (7r,0), the top of the pendulum's arc. The stable and unstable 
manifolds are indicated in Figure 5.1 — they are the unique manifolds containing only 
trajectories that all approach the critical point, either with positive time (unstable 
manifold) or negative time (stable manifold). 

We can see this behavior in IR as well. In Fig. 5.2 a hyperbolic critical point 
of a dynamical system (reminiscent of the wash from helicopter blades) is shown 
with two-dimensional unstable manifold (the ground) and one-dimensional stable 
manifold (vertical axis). The dynamical system restricted to the stable manifold is a 
source. There is another invariant manifold: a "saddle" invariant manifold consisting 
of the plane spanned by one eigenvector with a negative eigenvalue and one with a 
positive eigenvalue. 

As we shall discuss in Section 5.2.3, critical points in C(IR 3 ,2) in the space- 
invariant reflectance function case are not isolated: there will be a one parameter 
family of critical points as one moves along the projection direction due to the space 
invariant symmetry of the problem. This is a reflection of the well-known depth 
ambiguity. It makes sense in this case to reduce the dimensionality of the problem, 
effectively by ignoring the depth coordinate, to make the critical point an isolated 
hyperbolic point. 
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5.2 Image Critical Points 

Both Horn (Horn, 1975) and Bruss (Bruss, 1980) recognized the potential im- 
portance of critical points in the image intensities. As Horn pointed out, isolated 
image intensity critical points are likely to be due to critical points in the reflectance 
function; if these are also isolated, then the potential surface normal directions are 
limited to just these critical points on the reflectance function. As he indicated, one 
cannot draw out a solution surface from this known point in C(1R 3 ,2) with char- 
acteristic trajectories because the vector field is stationary there. His solution was 
to make a spherical cap through the critical point with an arbitrarily (large) cho- 
sen radius and consistent with the presumed normal. Solution curves could then be 
extended from this cap. He showed empirical evidence that these solution curves 
did not change much as the radius was changed, suggesting a certain stability of the 
solution trajectories. 

Bruss, although working towards uniqueness of solutions in a very restricted 
domain of reflectance functions, those with elliptical constant brightness contours in 
p, q space, made mention and use of the ideas of the stable and unstable manifolds 
associated with a critical point to demonstrate existence of solutions for her particular 
problem. We expand upon this. 

5.2.1 Location of Critical Points 

When are critical points of the image actually due to critical points in the re- 
flectance function? In the case of a space invariant reflectance function, R depends 
only on the orientation part of i : S — > C(IR 3 , 2). The orientation part of i is effec- 
tively the map from points on the surface to the normal directions in space and so 
is related to the Gauss map, N. 

In the study of surface shape, one examines the Gauss map N : i(S) — > S 2 C 
IR which maps points on the surface embedded in IR 3 to unit normals representing 
the orientation of the surface (Figure 5.3). The Weingarten map or shape operator 
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IR 



N 




Figure 5.3. Gauss map. It maps surface normals of i(S) to S 2 C IR 3 



is the derivative DN of this map. Since both the tangent plane to the embedded 
surface and the tangent plane to the embedded sphere are perpendicular to the 
surface normal, we can consider DN : Ti(S) — ► Ti(S). 2 If we examine DN at 
a point on the surface in IR 3 , it tells us how the orientation of the surface begins 
to change as we begin to move in different directions on the surface; for example 
the eigenvalues and eigenvectors of DN are the principal curvatures and principal 
directions along the surface. Here again, the tangent plane to S at x (in IR 3 ) is used 
as an approximation for the surface, and the value of DN tells approximately how the 
orientation has changed as one moves approximately to a nearby point on the surface. 
Thus we expect the characteristic vector field X on C(IR 3 , 2) to give us information 
consistent with DN in the direction of the characteristic trajectories: the orientation 
component of X is related to DN(X sp ), where X sp is the space component of X. 

Let us assume we have the embedding i : S -» IR 3 . We also have the lifted 
embedding i : S -»• C(1R 3 ,2) which takes p € S to the tangent plane T P S C T P 1R 3 . 
Since C(IR 3 ,2) ~ IR 3 x C(2,3), where G(2,3) is the set of all two-dimensional 
subspaces of IR , we can divide i into two components: \(p) = (i(p),n(p)), where 
i(p) e IR 3 is the space component of the embedding and n(p) e G(2, 3) is the 
orientation component. We have a natural map x '■ S 2 -»• G(2, 3) given by mapping 



A modern view of the shape map can be found in (Thorpe, 1979). 
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a normal vector to the plane perpendicular to it. We can now write xoJVoz(p) = n(p), 
and if we define R = R o x, we can write the image irradiance equation as 

Eo-K I oi = RoNoi. 

Taking the derivative and looking for critical points in the image, we have 

DE o Z)(7r 7 o i) = DRo DN o Di, 

where DN is the Weingarten map. If we assume we are on the interior of the image, 
then D(tt o i) is an invertible linear map, and so the image will have a critical point, 
that is DE — 0, if and only if 

= DRoDN, 

since i is a diffeomorphism between S and i(S). When does this condition occur? 

This condition certainly occurs when DN = at a point. This is a locally flat 
point, a point where both principle curvatures are zero. 

The condition can also occur when DN has rank one, i.e. when DN has just 
one non-zero eigenvalue. This occurs at parabolic points that are not locally flat. 
We get a critical point in the brightness if the principle direction, spanning the one- 
dimensional range of DN, is in the null space of DR; otherwise phrased, the principle 
direction must be perpendicular to the gradient of R. 

If this latter situation occurs, is it a stable occurrence? Could a small perturba- 
tion of the reflectance function remove this occurrence? Since in general parabolic 
points form curves on the surface, there will be a one dimensional curve of non-zero 
eigenvalue principle directions on S parameterized by the parabolic curve, one at 
each point of the curve. There is also a one dimensional set of null directions for DR 
parameterized by the parabolic curve on S. At each point on the parabolic curve, 
there is only a two-dimensional vector space of possible tangent vectors since T p S 
has dimension two, and hence only a one-dimensional space of tangent orientations 
(parameterized, for example, by angle). We can locally model this one dimensional 
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Figure 5.4. Tube of possible orientations of principle direction for DN (ITi) and null direction for DR 
(1^) along a parabolic line, a{t). An orientation at time t is defined as a point on the unit circle with t as 
its center; this picks an angle, 8, for the orientation of the direction vector. The intersection, p c , of the two 
curves gives a critical point of the image on the parabolic line. 

set of orientations as a circle, and consider the parabolic line together with the pos- 
sible orientation directions as a two dimensional tube IR x S 1 : one coordinate gives 
distance along the parabolic curve, the other coordinate (around the circle) gives 
an orientation (Figure 5.4). The null directions of DR and the non-zero eigenvalue 
principle directions form curves that sit on the tube: at each point of the parabolic 
curve, they pick out two orientations on the circle. If there is a point on the surface 
where these two coincide, we can consider the curves on the tube to have intersected. 
This will be a point where the principle direction falls in the null space of the gra- 
dient of R, and hence is the critical point of interest. Since this is an intersection of 
one dimensional curves on a two dimensional surface, the intersection at a point is 
transversal, i.e. stable to small smooth changes in the parabolic line location or the 
principal directions or the reflectance function. 

Koenderink and Van Doom (Koenderink and Van Doom, 1979) have also de- 
scribed this kind of point on the surface where the image has critical points. If we 
consider the Gauss map, N, as laying the surface i(S) onto the unit sphere, then the 
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Parabolic point 

/ 




Parabolic point 



Section through Gauss map of S 



Constant brightness contour 
for reflectance map 



Figure 5.5. The surface S mapped to the Gaussian sphere: parabolic lines (dashed lines) are folds in this 
map. a. An exploded section through the Gauss map of a surface onto the sphere — the fold lines are lines 
of parabolic points, and there is a triple thickness of surface mapped between the folds, b. Relationship 
between the Gauss map and reflectance function constant brightness contours (solid lines on the sphere). 
The dashed lines are folds with a triple thickness of surface mapped between them; p e is a parabolic point 
with maximum brightness on a fold, and so will be a saddle critical point for the image. 

parabolic lines on the surface are the lines of the folds for the image of the surface 
(Figure 5.5 a). If we overlay the reflectance function on top of the Gaussian sphere, 
then a maximum in brightness along the parabolic line, or fold, will correspond to a 
critical point in the image even though the local maxima of the reflectance function 
will be elsewhere: effectively, the fold has trapped the brightness values (Figure 5.5 
b). If we consider a curve a : 1R — > S on the surface crossing a parabolic curve and 
consider the corresponding path on the Gaussian sphere which connects the surface 
normals along a, this path on the Gaussian sphere touches the parabolic fold and 
does not cross it. If the path runs in a certain direction, the path will actually double 
back on itself, providing one critical direction for the brightness values. This will 
be true at any parabolic point. If there is a maximum or minimum of the bright- 
ness along the parabolic fold, this will provide the other critical direction to make 
a critical point for brightness on the fold. Depending on the relationship between 
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the constant brightness contours on the Gaussian sphere and the parabolic line, this 
critical point in the image can be a maximum, a minimum, or a saddle. 

Since the characteristic vector field in the usual coordinates has the form 



X = 



R p 

Rq 

pR p + qR q 

E x 

Ey 



these parabolic image critical points will not be critical points in the image dynamical 
system, since DR is not zero and so the vector field is not zero. From the form of the 
vector field X, the space part of X is non-zero at these points while the orientation 
part goes through a zero; there is only a minor influence of these points on the flow 
picture in space. 

Finally, if DN is of full rank, then necessarily a critical point in the image must 
be due to a critical point in R, since DN spans the effective domain of DR, and so 
DR o DN = implies DR = 0. 

In summary, we have several kinds of critical points: 1) due to DN = 0, i.e. 
locally flat points; 2) at certain locations on parabolic lines; and 3) at points where 
the reflectance function has a critical point. Critical points in the image that are due 
solely to critical points in the reflectance function will be called good critical points; 
all others will be called bad. Note also that the bad critical points in the image are 
almost always not due to reflectance function critical points, and so the brightness of 
these image critical points will not be the same as those due to reflectance function 
critical points. In the simplest case of a reflectance function with single maximum 
(e.g. a Lambertian reflectance function) generically only the good critical points 
have the brightest value of the reflectance function. 

We also ignore in this analysis the role of shadows and shadow boundaries. The 
most egregious omission is the role of the self-shadow boundary which can be con- 
sidered to be due to the reflectance function: usually there is an extended set of 



103 



Chapter 5 Critical Points of the Image Dynamical System 

orientations for which the reflectance is at a minimum because patches of the sur- 
face are turned completely away from the light source. We will focus exclusively on 
images (or portions of images) without self-shadows. 3 

5.2.2 Good Critical Points 

If we are at a good critical point, what kind of critical point can it be? We 
can look at the second derivative of the image irradiance equation to get some ideas 
about this. 

We will implicitly assume a coordinate system for the calculations-this will allow 
expressions like D 2 f to make sense on their own, as the matrix of second derivatives 
with respect to the coordinates. 4 

For conveniance, we define N = Noi. Taking derivatives of the image irradiance 
equation as originally expressed, we have 

Eon 1 o i(p) = R o N(p) 
DE w r oi{p) o D^ 1 o »)„(v) = DRfj {p) o DN p (v) 
D 2 E(D(x I o «)(u), £>(7r 7 o »)(v)) + DE o D 2 (tt i o »)(u, v) = 

D 2 R{DN(u), DN(v)) + DRo D 2 N(u, v). 

At a good critical point both DE and DR are 0, so we have 

D 2 E(D(tt I o »)(u), Z)(7r 7 o j)(v)) = D 2 R(DN(u), DN(v)) 



Note that the self-shadow line is a constant brightness contour; it is the "dark side" of the self-shadow 
line that is not included here. 

One could define second derivatives independently of coordinates in two ways: one way is to consider 
derivative maps of real functions as maps from the tangent bundle TM to IR. The second derivative is 
essentially the derivative of this map. In coordinates, the most interesting component ends up being a 
linear combination of the second derivative in coordinates and the first derivative in coordinates. Another 
way to deal with second derivatives on surfaces is to specify a particular way of tying the tangent planes 
together: this is called a connection, and effectively tells how to take derivatives of arbitrary tensors on the 
surface. For our purposes we do not need either construction. 
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In the interior of the image, tt 1 o i will be a local diffeomorphism; if we are not on 
a parabolic line, DA'' = D(N o i) will also be a local diffeomorphism, and hence 
D E and D R will be of the same rank and type (i.e. same numbers of positive and 
negative eigenvalues). 

To see this, we can consider the different derivatives in this last expression as 
matrices. "D 2 E n is a bilinear function of its arguments, so if we use a matrix 
representation and let A = D 2 E, B = D 2 R, P = D(-k t o i), and Q = DN, then we 
have for all u and v in the tangent space to S at p: 

u T P T APv = u T Q T BQv, 

and hence 

P T AP = Q T BQ. 

We can write 

A = (QP-yBiQP- 1 ). 

In Appendix A5.1 to this chapter, we show that this means A and B have the 
same number of positive and negative eigenvalues, so that D 2 E is positive (negative) 
definite if and only if D 2 R is. We will assume from here on that D 2 R is full rank, 
the usual and generic case. 

If we are at a good critical point which is a maximum of the reflectance function 
so that D 2 R is negative definite, D 2 E will also be negative definite, and hence 
the critical point in the image will be a local maximum. Such a critical point is 
surrounded by closed constant brightness contours, and the presence of such a critical 
point in a region can be strongly suspected if there are closed constant brightness 
contours in the region which are increasing in value towards the inside. In a sense, a 
maximum brightness point has an image "signature" that extends around the point. 
We suspect that such a signature without the actual presence of the critical point 
(e.g. the critical point is obscured) may be sufficient to get strong limits on possible 
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solution surfaces. (Brooks (Brooks, 1982) discusses an example of an annular image 
of a hemisphere lit from the viewing direction with the hole of the annulus placed 
over the critical point.) 

If the reflectance function has a single maximum point (e.g. the Lambertian 
case) this will be the only critical point in the reflectance function, and will generate 
the only good critical points in the image. In this case, any saddles in the image will 
be due to parabolic points. With more general reflectance functions, saddles in the 
reflectance function can lead to saddles in the image at good critical points. 

5.2.3 The Linearized and Reduced Image Dynamical Sys- 
tem 

The good critical points generate critical points of the image dynamical system. 
A solution surface consistent with a smooth image patch containing a good critical 
point should be a smooth surface made up of characteristic trajectories; this means 
we are interested in locally invariant manifolds of the image dynamical system which 
include the critical point. As indicated in Section 5.1.1, an important technique for 
studying the behavior of a dynamical system around a critical point is to look at the 
linearization of the dynamical system at the critical point. 

We can examine in detail the behavior of an image dynamical system under the 
conditions used to derive a coordinate expression for X in Chapter 4: we assume 
orthographic projection and a space-invariant reflectance function. As derived in 
Chapter 4, this leads to a characteristic vector field on the image interior of the form 



X = 



R p 

R q 

pR p + qR q 
E x 
E y 



using the standard rectilinear (x,y,z,p,q) coordinate system on IR 3 and C(IR 3 ,2), 
with image projection given by v^x, y, z) = (x, y). We can calculate the linearization 
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of this vector field at a critical point p c where = R p = R q = E x = E y as 

X' 

where we use the notation E xx to mean (Eow c ) xx , where tt c is the image projection 
from C(IR ,2), 5 and we have substituted the values R p = R g = after taking the 
derivative. 

By inspection we can see that the matrix X' is singular, meaning X' has at least 
one zero eigenvector. It is the vector (0, 0, 1, 0,0) r and reflects the translation invari- 
ance along the projection direction. This means we are not at a hyperbolic critical 
point, which would require non-zero eigenvalues for the linearized system. Although 
there are some results on the existence and uniqueness of invariant manifolds for 
non-hyperbolic critical points, the Grobman-Hartman theorem mentioned in Section 
5.1.2 suggests it would be convenient to make the problem into a dynamical system 
with a hyperbolic critical point. Among other difficulties, a non-hyperbolic critical 
point is not generic: a small perturbation can radically change the character of the 
dynamical system. 

We can convert our image dynamical system by focusing on the (x, y,p, q) coordi- 
nates and leaving out the z coordinate. Effectively, the ordinary differential equation 
defining the dynamical system, 

x = X, 



This small confusion in notation turns out to be correct here because the image projection, ^{x, y, z) = 
(x,y), in coordinates makes E xx = (E o ~n. c ) xx , where the first set of derivatives are with respect to the 
image coordinates and the second set are with respect to coordinates on C(IR 3 , 2). 
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does not depend on z on the right hand side; we can solve for (x,y,p,q) first, and 
then integrate the z equation to find the solution for z. 6 In coordinates the reduced 
dynamical system is 



X = 



R p 

Rq 

E x 
E,, 



with 



X' = 
















xy 



Thpp Xtj 
■^qp •K't 






pq 



qq 








We will call the new space on which X is a vector field C(IR 3 , 2). 
5.2.4 Eigenvalues of X' 

What are the eigenvalues and eigenvectors of X't Examining the matrix form of 
X' we have: 



X' = 



D 2 R 
D 2 E 



where D 2 E (shorthand here for D 2 (E o ir c )) and D 2 R are both 2x2 matrices 
representing second derivatives oiEow c and R with respect to space and orientation 
coordinates respectively. We can consider the eigenvalues of 

D 2 RoD 2 E 



X'X' = (X 1 ) 2 = 




D 2 EoD 2 R 



we have 



= det((X') 2 - 7/) = det(D 2 R o D 2 E - 7/) det(D 2 E o D 2 R - 7/) 
= (det(Z) 2 Eo J D 2 J R- 7 /))) 2 , 



It is also possible to do this in an invariant manner by using the symmetry group corresponding to 
flow down the projection direction: the dynamical system is invariant to this symmetry, and so the order 
of the system can be reduced by one. We have a new dynamical system denned on the space J x G(2, 3) 
where G(2, 3) is the space of two-dimensional subspaces of a three-dimensional vector space, and I can be 
considered either as the space of projection fibres or as the image. 
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so the eigenvalues of (X') 2 are the same as the eigenvalues of D 2 E o D 2 R. We also 
have 

det((X') 2 - 7/) = det(X' - A J) det(X' + A/), 

where A 2 = 7. If 7 is a characteristic value for (X 1 ) 2 , then ±A are possible char- 
acteristic values for X'; one of them must be a characteristic value. It turns out 
(see below) that both actually are characteristic values of X'. The 2x2 submatrix 
A = D Ro D 2 E has a quadratic characteristic polynomial, and the characteristic 
equation for (X 1 ) 2 is 

= ( 7 2 + trace(A) 7 + det(A)) 2 . 

This means the characteristic values for X' must be the four roots of 

= A 4 + trace(A)A 2 + det(A), 

and so come in pairs as the two square roots of each characteristic eigenvalue for A. 7 
As discussed in Section 5.2.2, D 2 R and D 2 E have the same number of positive 
and negative eigenvalues. In the case where both matrices are definite, we show in 
Appendix A 5.3 that A = D 2 R o D 2 E will have positive real eigenvalues, and hence 
X' will have real eigenvalues, generically distinct. In the case where D 2 R is indefinite, 
we may generically have negative roots for A, yielding purely imaginary eigenvalues 
for X' and the critical point may therefore not be hyperbolic. We will focus our 
attention on very good critical points, those due only to definite critical points of 
the reflectance function: the usual case of the maximum of reflectance function is a 
very good critical point. The analysis of invariant manifolds of the image dynamical 
system at indefinite critical points of the reflectance function is an area of future 
research. 



The coordinate form of X suggests a four-dimensional Hamiltonian dynamical system with (x, y, p, q) 
coordinates as Darboux coordinates for the 2-form u = dx A dp + dy A dq and the image function H as 
Hamiltonian (Abraham and Marsden, 1985). The distribution of eigenvalues is the result of the matrix X' 
being infinitessimally symplectic. 
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In general (i.e. generically in some sense) at a very good critical point there will 
be four distinct eigenvalues of X' with four associated independent eigenvectors. Us- 
ing the results os Section 5.11, pairs of these eigenvectors generate possible invariant 
tangent planes, six in all. If there are symmetries in the system (for example the 
image of an object symmetric around the viewing axis lit from the viewing direc- 
tion) we can get eigenvalues with multiplicity two, which potentially means a two 
dimensional space of eigenvectors. From the stable manifold theorem there will still 
be unique stable and unstable manifolds, but now there may be an infinite number 
of possible saddle manifolds and hence solution surfaces because there may be an 
infinite number of subspaces spanned by pairs of eigenvectors. We will concentrate 
on the generic case without symmetries. 

Not all possible pairs of eigenvectors will generate tangent planes in C(IR 3 ,2) 
that could possibly correspond to a real two dimensional surface embedded in three 
dimensions. We know that X' has the form 



X' = 



D 2 R 
D 2 E 



where D 2 R and D 2 E are essentially the second derivative matrices of the reflectance 
and image at the critical points. If u is an eigenvector for X', we have 











X'u = Au 



D 2 E 


D 2 R 





lisp 


= 






'D 
D 


2 1 
2 1 


XU or 
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AU S p 

Au or _ 



Now let u' = (-u sp ,u or ) T 



X'u' = 
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-Au„ r 
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so that —A is also an eigenvalue, and u' is its associated eigenvector. 

Although u and u' may be independent as eigenvectors associated with distinct 
eigenvalues in the four-dimensional reduced space C(IR 3 ,2), the space parts of the 
eigenvectors u and u' do not project to independent vectors in IR 2 and hence the 
subspace spanned by u and u' does not correspond to a two dimensional surface in 
IR . This eliminates two of the six pairings of eigenvectors. 

By the Grobman-Hartman theorem, the remaining four possible invariant tan- 
gent spaces correspond to four possible invariant manifolds of X through the critical 
point that could correspond to solution surfaces. One will be the stable manifold 
with sink flow type when restricted to the invariant surface; another will be the un- 
stable manifold, with source flow type on the invariant manifold; the two remaining 
invariant manifolds will have flows that are of saddle type. 

There is one other possible constraint that may eliminate the saddle invariant 
manifolds as candidates. For a solution surface S C C(IR 3 , 2) to solve the shape from 
shading problem at hand, we must have R(p, q) = E(x, y) everywhere on it; this can 
be written using the image dynamical system function as H(p) = for all p € S. 
The characteristic trajectories are curves on which H is constant, but not necessarily 
zero. In the case of the stable and unstable manifolds, all the trajectories in them 
approach the critical point p c asymptotically, and H(p c ) = if we have matched the 
critical point of the image to the correct critical point of the reflectance function. 
This means H = automatically on the unstable and stable manifolds. However, 
there are only four trajectories in the saddle case (referring to the image dynamical 
system type restricted to the invariant manifold) that actually approach the critical 
point; it is conceivable that the other characteristic trajectories making up the saddle 
invariant solution surface have H non-zero along their lengths. If this were the case, 
these invariant solution surface candidates could be rejected as possible solutions to 
the shape from shading problem. 

However, dynamical saddle invariant solutions do occur and cannot be discarded 
out of hand as possible solutions. As we shall see in the next section, for very 
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good critical points at which the true surface is saddle-shaped in space, the image 
dynamical system restricted to the true surface is of saddle type. If, as in this case, 
the image dynamical system restricted to the true solution surface is of saddle type at 
the critical point, then H must be on all the constituent characteristic trajectories 
even though they do not all approach the critical point. 

5.2.5 Solution Surfaces Near a Good Critical Point 

Given a solution surface with a particular Weingarten map DN describing the 
local surface behavior near a critical point, what is the relationship between the sur- 
face shape, the reflectance function critical point type, and the type of characteristic 
flow generated on the surface? The characteristic vector field restricted to the two- 
dimensional invariant solution surface defines a two-dimensional dynamical system 
on the surface. We are interested in the two-dimensional flow on the solution surface 
near critical points of the system: is it a source, sink, or saddle? 

As in Section 5.2.4, we can decompose the vector field X into two pieces, 



X = 






where X sp is the space component and X or is the orientation component. We can 
similarly divide X 1 , the linear approximation to the vector field at a critical point 
into space and orientation components 

Y ' 

■s^sp 



Y ' 



x' = 

From the form of X' derived in Section 5.2.1, we get 

X S p (u S p, u or ) = D R o u or . 

This represents the linear approximation to the space part of the vector field X at 
an approximate displacement (u S p,u or ) from the critical point p in C(IR 3 ,2). 
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In Section 5.2.1 we looked at the relationship between the Gauss map N and the 
orientation component of the lifted embedding i : S —* C(IR 3 , 2): we saw that \{p) = 
(i(p),n(p)) e Ht 3 x (7(2,3) and x(N(i(P))) = n (p) is tne ma P relating the surface 
normal N(i(p)) to the tangent plane n(p) at p. We must have D\ ° DN o Di = Dn. 
If we assume the embedding i : S — ► M 3 is the inclusion and let i : S — ► C(IR 3 ,2) 
instead of C(IR ,2) (effectively we drop the depth coordinate), then we can write 
Di(u sp ) = (u sp , Dn(u sp )) = (u sp ,u or ), so 

u or = Dn(u sp ) = Dx o DN(u sp ). 

The u or component tells how the orientation direction changes on the surface while 
moving on the surface in the u sp direction — this is very nearly what the Weingarten 
map tells us too. Combining this with the previous expression for X sp we get 



X sp '(u sp , u or ) = D 2 R o Dx o DN o u 



sp- 



Considering u or as essentially determined by the Gauss map N for the fixed embed- 
ded surface S and u 3p , as a vector field on the surface S we have 



Xsp' (u, p ) = D 2 RoD X oDNou 



sp- 



This tells us that the approximate space component of the characteristic vector field 
restricted to a solution surface near a critical point is determined by the derivative 
map D 2 Ro Dx o DN. We are interested in finding the type of the two-dimensional 
dynamical system on S determined by X sp . 

By picking local coordinates on the Grassman manifold of two-dimensional planes 
in three dimensions, (7(2,3), to match coordinates on the two-dimensional sphere S 2 
(e.g. both using central projection onto the same plane z = -1 to give standard 
(x,y,z,p,q) coordinates), we can make Dx become the identity matrix in these 
coordinates; we will let B be the matrix for D 2 R, C be the matrix for DN, and 
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A = BC be the matrix for D 2 R o D\ ° DN. The eigenvalues for A will tell us the 
type of the dynamical system on S. 

A is the product of two symmetric matrices but itself need not be symmetric. 
For an asymmetric matrix, eigenvalues may be complex; when does this happen to 
A? In Appendix A5.2 to this chapter we show that if either B or C is definite, then 
A has real eigenvalues and has a full set of eigenvectors. The only case in which 
A = D 2 Ro DN will not always have real eigenvalues is the case where we have a 
saddle in the reflectance function and a saddle in the surface. 

Assume A has real eigenvalues. If ji and 72 are the eigenvalues of B = D 2 R, 
and «i and «2 are the eigenvalues (curvatures) of C = DN, then 

det(A) = 7172K1K2 = a\a2, 

where «i and a 2 are eigenvalues for A. If 7x72 > 0, then the sign of det(A) equals 
the sign of kik 2 . Thus, if R has a maximum at the critical point so that D 2 R is 
negative definite, and the shape of the solution surface is either convex or concave, 
giving same sign principle curvatures, then the space part of the characteristic flow 
restricted to the two dimensional spatial solution surface is either a source or sink 
since the eigenvalues of A will have the same sign. Similarly if the solution surface 
is a saddle at the critical point, then the space part of the characteristic flow will be 
a saddle flow. 
We also have 

trace(A) = trace(£C) = tr&ce(BPAP T ) = trace(P T J BPA), 

where P is an orthonormal matrix of eigenvectors for C = DN and A is a diagonal 
matrix of eigenvalues for C, so P diagonalizes C. Let us assume the diagonal entries 
of P BP are d n and J 2 2 5 the eigenvalues of C are the principle curvatures «! and 
«2- Then 

trace(A) = aj + a 2 = «ldn + «2</ 2 2. 
114 



Chapter 5 Critical Points of the Image Dynamical System 




a. 




Figure 5.6. Patches of surfaces lit from above. In the middle of each patch we have a critical point due to 
a maximum in the reflectance function; the space part of the dynamical system restricted to the (invariant) 
surface patch is shown, a. Convex surface patch, b. S addle shaped surface patch. 

If D 2 R = B is negative definite so is P T BP, and hence d n and d 2 2 are both less 
than zero: we have 

(1 0)P T Bp(^)=dn<0, 

and similarly for d 2 2- In this case, if DN = C is negative (positive) definite, then 
we know that trace(^) is greater than (less than) zero. The eigenvalues a^ and 
«2, having the same sign by the previous argument, are positive (negative) and the 
characteristic flow must be a source (sink). We get similar results if D 2 R is positive 
definite at the critical point, suggesting a minimum in the reflectance function. 

Assuming A has real eigenvalues (for example, if DN is definite), a saddle in 
the reflectance function determines a saddle flow on the invariant manifold by the 
sign of the product of the eigenvalues of A. However, in the case where DN is not 
definite we are not assured that the eigenvalues of A are real, and the flow type is 
undetermined by these simple arguments. 

Figure 5.6 gives two examples for surfaces lit from above with very good critical 
points due to reflectance function maxima: Figure 5.6a shows a convex surface, on 
which the image dynamical system restricts to a sink, and Figure 5.6b shows a saddle 
shaped surface, on which the image dynamical system restricts to a saddle dynamical 
system. 
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Positive Definite 


Negative Definite 


Saddle 


Positive Definite 


Source 


Sink 


Saddle 


Negative Definite 


Sink 


Source 


Saddle 


Saddle 


Saddle 


Saddle 


Undetermined 



We can summarize the results in the following table. Entries in the table give 
the flow type of the two-dimensional characteristic system restricted to the two- 
dimensional surface. 

Form of DN 



Form of 
D 2 R 



Given an image due to a generic real surface and a known very good critical 
point (i.e. due to a definite (positive or negative) reflectance function critical point), 
there are at least two and at most four smooth surfaces consistent with an image 
patch containing a good critical point. In the usual case of a very good critical point 
due to a maximum in the reflectance function, the invariant solution surface will be 
the stable (sink type flow) or unstable (source type flow) manifold of the dynamical 
system if the surface is convex or concave; the invariant solution will be a saddle-flow 
invariant solution if the surface is saddle shaped. 

5.2.6 The Fundamental Instability of a True Image Irra- 
diance Equation 

A true solution surface should consist of a single two-dimensional invariant man- 
ifold which is the unstable manifold for some critical points, the stable manifold for 
others, and a saddle invariant manifold for the rest. It turns out that this is very 
unusual behavior for a dynamical system. Generically, two two-dimensional invari- 
ant manifolds will only intersect (if they intersect) along a one-dimensional curve 
rather than merging (Abraham and Marsden, 1985). An intuitive explanation for 
this is that two generically transverse two-dimensional surfaces in a four dimensional 
space have a O-dimensional intersection, i.e. a point; if both the surfaces are invari- 
ant under a flow, however, the flow of this point forward and backward in time will 
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Figure 5.7. Two critical points of a two dimensional dynamical system: a. Non-intersecting stable and 
unstable manifolds, b. Intersecting stable and unstable manifolds, c. Non-intersecting stable and unstable 
manifolds. 

also be in the intersection of the two invariant surfaces, leading to a one-dimensional 
intersection. 

We can gain some insight into the situation by looking at two dimensions. In 
Figure 5.7, we show a piece of a two-dimensional dynamical system with two critical 
points, each of which has both a stable and an unstable manifold. In Figure 5.7 a 
and 5.7 c, these manifolds do not intersect; the unstable manifold for one critical 
point is just another trajectory for the other critical point. The case where they 
intersect is shown in Figure 5.7 b. If Figure 5.7 b is perturbed generically, it will fall 
apart to be either like Figure 5.7 a or 5.7 c. 

The image dynamical system due to a real surface is therefore a very delicate 
thing seen from a generic perspective. This has two consequences, one bad, one 
potentially good. The bad news is that as we numerically try to find the stable 
or unstable manifold of a critical point (by directly drawing out trajectories or by 
the methods in the next section), errors will occur as if we had randomly perturbed 
the dynamical system. It is very unlikely that the perturbed unstable manifold will 
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merge with the perturbed invariant manifold coming from another critical point and 
create a single, smooth, two-dimensional surface. The good news is that that this 
may provide a way to tell a bad reflectance function choice from a good one: if 
the invariant manifolds do not "nearly" match up, we must be on the wrong track. 
Either the reflectance function is bad, or there is no surface corresponding to the 
image. 

Note one other theoretical hope for optimism: although generically the two- 
dimensional invariant manifolds intersect in a manifold of dimension at most one (and 
may not intersect at all), we may get better results if we look at a one- (or many-) 
parameter family of vector fields, e.g. by examining a one-parameter family of re- 
flectance functions. A result with a catastrophe theoretic flavor about generic vectors 
fields (Abraham and Marsden, 1985) indicates that if we have a one-parameter fam- 
ily of vector fields X c on a four dimensional space controlled by the real parameter 
c such that at c = a pair of two-dimensional invariant manifolds actually com- 
pletely merge, then perturbing this entire family does not destroy this occurrence: 
in the perturbed family, there will still be a value of the new control parameter such 
that the matching occurs. For example, if we know that the reflectance function 
comes from a one-parameter family of reflectance functions and the image of the 
real surface contains two critical points, then the perturbations of the system due 
to numerical errors or slight errors in the reflectance function will not remove the 
theoretical occurrence of a parameter value such that the invariant manifolds merge. 

In our case, we may need to match more than two invariant manifolds, and we 
may have more parameters in the reflectance function, so the task is harder than the 
above. The generic theory for more than two parameters of control is not well worked 
out, but it may be that adding more parameters of control allows one generically to 
match more invariant surfaces together and thereby determine more parameters of 
the reflectance function. 
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5.3 Invariant Manifold Shape from Shading Algo- 
rithms 

From Section 5.2, we can conclude that good critical points of the image dy- 
namical system provide constraints on reasonable solutions. Given a smooth image 
interior, we expect corresponding solution surfaces also to be smooth. In a neigh- 
borhood of a critical point in the image caused by a critical point of the reflectance 
function, a correct solution surface must be a smooth surface drawn out by character- 
istic curves (and can be considered locally invariant to flow along these curves), and 
must contain the critial point. Thus, a consistent solution surface passing through 
all the critical points of a smooth image interior should be an invariant manifold 
containing all these critical points. In some sense, this is quite special behavior for 
an invariant manifold, as we mention in Section 5.2.6. 

In the space invariant case, we can consider the image dynamical system near 
the critical point to live in the four dimensional space C(]R 3 ,2) which can be given 
the coordinates (x,y,p,q), where x and y are image coordinates, and p and q are 
the "gradient" coordinates for orientation of the surface. The typical (i.e. generic) 
image critical point due to a maximum in the reflectance function will be an isolated, 
hyperbolic critical point of the dynamical system. The Grobman-Hartman theorem 
lets us conclude that the nonlinear image dynamical system near the critical point 
will be topologocally isomorphic to the linearized dynamical system near the critical 
point; as a result, there are in general at least two possible invariant manifolds 
through the usual maximum critical point, the stable and unstable manifolds of the 
critical point, and in the generic case (e.g. no symmetries in the dynamical system) 
at most four. 

These theoretical results indicate that one can find possible solution surfaces for 
an image in the neighborhood of a critical point by finding invariant manifolds of the 
critical point. One technique is to pick a curve in C(IR 3 , 2) (an initial strip) that lies 
approximately close to the invariant manifold of interest, and then use the flow to 
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stretch this curve out to cover a larger region. This is the essence of Horn's original 
attempt (Horn, 1975) to find solution surfaces around a critical point. As he pointed 
out, there are integration problems as one gets far from the original curve due to 
the build-up of quantization and integration errors. As a result Horn and Brooks 
(Horn and Brooks, 1986) examined various regularization techniques designed to use 
more of the image data at once to generate a solution surface, but did not use the 
characteristic curves. 

There is another method for finding some of the invariant solution surfaces corre- 
sponding to solutions of the shape from shading problem. In this section we discuss 
the intuition behind a theorem called the Lambda Lemma, or the Inclination Theo- 
rem. If we take an initial manifold of the same dimension as the unstable manifold 
and transverse to the stable manifold of a critical point and deform the initial mani- 
fold using the flow of the dynamical system, then in the limit as time goes to infinity 
the Lambda Lemma asserts that the deformed surface will approach the unstable 
manifold in a C 1 manner (Palis and de Melo, 1982). We discuss two different im- 
plementations of this intuition to find the unstable manifold of an image dynamical 
system near a critical point, and show how these methods are affected by noise in 
the image and errors in the reflectance function: they turn out to be fairly robust. 

5.3.1 The Intuition 

The unstable manifold of the critical point of a dynamical system consists of 
trajectories that seem to emerge from the critical point at t — -co and continue 
away from the critical point. Figure 5.8 gives an example of a hyperbolic critical 
point (i.e. a critical point whose linearization has only real eigenvalues) in two 
dimensions; the unstable manifold consists of the two trajectories that approach the 
critical point with decreasing time. 

Other trajectories starting near the unstable manifold and near the critical point 
approach the unstable manifold for some time within some neighborhood of the 
critical point. If we take an initial manifold S° which crosses the stable manifold 
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Figure 5.8. Two dimensional hyperbolic critical point. 




Figure 5.9. Initial manifold is S°; S> are the deformations of 5° by the flow of the dynamical system. 



(Figure 5.9), we can watch what happens as the initial surface is deformed by moving 
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each point of the surface along the flow of the dynamical system, as a stream of water 
might move particles suspended in it. Since the surface intersects the stable manifold, 
at least one point on the deformed surface begins to approach the critical point. In 
addition, the behavior of trajectories near the unstable manifold is an exponential 
approach to the unstable manifold within a certain neighborhood of the critical point; 
as the initial surface is stretched and deformed by the flow, more and more of it will 
be stretched and pushed down towards the unstable manifold by the flow lines. In 
the limit, this deformed surface will become the unstable manifold. Although the 
example shown is in two dimensions, the intuition holds more generally. A proof 
of the Lambda Lemma comes by looking at the deformation of an initial surface 
carefully and taking limits (Palis and de Melo, 1982). 

The basic intuition behind the example and the theorem is to begin with an initial 
surface that cuts the stable manifold transversely, and then deform it through time 
using the flow of the dynamical system. We have implemented this as an algorithm 
on a highly parallel computer called the Connection Machine in two different ways: 
in the first case, we look mathematically at how the surface is deformed by the flow 
without following points on the surface; in the second case, we actually follow a grid 
of points on the initial surface as the flow moves them around. 

5.3.2 Fixed Grid Algorithm 

The Connection Machine is a highly parallel computer consisting of thousands 
of simple processors and a very fast disk drive system which can emulate a grid of 
processors. We used a 16k CM-1, which has 4k of memory for each of 16k processors. 
The kind of algorithms we will propose are extremely well suited to this kind of 
architecture with a SIMD language called *LISP as interface, since each processor 
can be assigned to a pixel and operates in a neighborhood of that pixel in parallel 
with all the other processors. 

We configured the processors as a 128 x 128 grid. For the fixed grid algorithm, 
each processor can be thought of as stuck to an unmoving (x, y) coordinate plane 
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parallel to the image plane. A two-dimensional surface in the four-dimensional space 
C(TR ,2) is defined above the plane by specifying (p,q) coordinates at each (x,y). 

5.3.2.1 Theory 

In determining how the initial surface is deformed by the image dynamical flow, 
we are interested in how the surface is deformed as a set, not necessarily in where 
each point is mapped by the flow. Knowing how the deformation process affects the 
heights of the surface above each mesh point is sufficient: it is a two-dimensional 
surface in a four-dimensional space, so we have both p(x,y) and q(x,y) as heights 
above the point (x, y). Assuming the initial surface and the flow are smooth, we can 
theoretically compute the small change in surface height above a fixed point due to a 
short flow along the vector field. We can use this to iterate the surface height above 
each mesh point and find how the surface deforms. 

To do this, we consider a slightly more general situation: say we have a time- 
dependent deforming surface S f defined by 

S* = {p\G t (p) = 0} 

for some G(t,p) where G : JR x £(IR 3 ,2) — ► IR 2 is smooth, with G t (p) = G(t,p) 
having the maximal rank of two as a function of p for all t: the surface S* is the level 
set at of the function G t . Assume we also have a surface P = {p\F(p) — 0}, where 
F is also smooth and of maximal rank. In our case, P — {(a, b,p, q)\p, q e IR} is the 
plane of possible (p, q) values for a fixed (x, y) = (a, b) in the usual coordinate chart. 
We are interested in the intersection of S* with P as time proceeds; this will give us 
(p, q) as a function of t above a fixed (a;, y) = (a,b). 

What happens to points in S* f"l P as a function of time? Let us consider a path 
a : IR — ► S* fl P that stays in the intersection of the two surfaces for some time 
interval around t = 0, with a(0) = po- This means that for all valid t, we have 

F(a(t)) = 

G t (a(t)) = 0. 
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We can look at the time derivatives to get some constraint on the time derivative of 
a, a'(0) = v, at t = 0: 



DF(v) = (f) 

±g) + D(G )(v) = 0. ($) 



In our case, we can write 



Gt(p) = 9 o 4>-t(p), 

where <f> t is the flow due to the characteristic vector field X, and S° = {p\g(p) = 0} 
is the initial surface we started with. 8 We have, using the chain rule and the fact 
that <j>t is the flow for X, 



(H**~ D "* M 



and 



(o,p) 



D(Go)(v) = Dg{v), 



since Gq = g o <j>o(p) = g(p). Substituting in to the second constraint (J) on v we 
have 

Dg(X) = Dg{w). 

In our case, F, Gt, and g all have rank 2. If we assume that F and Gt are 
maximally independent (i.e. the respective surfaces are transversal) at intersection 
points, then we should only have isolated intersection points at each t: we are inter- 
secting two two-dimensional surfaces in a four-dimensional space. This means the 
curve ct(t) in the intersection of the two surfaces is, in fact, a unique path, and it 
has tangent vector at t = given by v = (v x ,v y ,v p ,v q ), where v G Null(dF) and 
Dg(X) = Dg(v). 



8 So p is in 5* if and only if G t (p) = 0, i.e., if and only if g o <j>-t(p) = 0, i.e., the point at t = that 
flows to p at time t is in S = {p\g(p) = 0}. 
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With our particular coordinate choice, this is an easy system to solve: we have 
F(x,y,p,q) = (x,y), so the first condition (f) on v, DF(v) = 0, means that v x = 
v y = 0. If we write 

d(x, V, P, q) = (K x , V) ~ P, c(x, y) - q) 

which lets us think of the initial surface S° in C(IR 3 , 2) as parameterized by 

(r,s) i — >(r,s,b(r,s),c(r,s)), 

then we have 

b x by — 1 



L, x Uy 



Zy -1 



Dg = 

so the second condition on v, Dg(X) = Dg(v), can be written as 

b x X x -\- byXy — Xp = —Vp 

C X X X -j- CyXy — Xq = — Vq. 

In the usual (x,y,p,q) coordinate system used for the critical point analysis, we can 
replace in for the values of the characteristic vector field components to get: 

Vp — E X — b x Rp — byRq 

Vq ^ Ely C x itp Cyitq, 

v p and v q are the infinitessimal displacements at t = of the surface above a fixed 
(x,y) due to the dynamical flow deformation. We recalculate the partial derivatives 
b x , by, c x , c y where b(x,y) - p(x,y), c(x,y) = q(x,y) at each step of the iteration. 

We can perhaps view v p and v q as defining a vector field on the space of smooth 
two-dimensional surfaces in C(IR 3 ,2): at each t, we have a smooth surface which 
defines the derivatives of p and q (the derivatives of b and c respectively) with respect 
to the coordinate chart we have picked. These derivatives are combined together to 
give a smooth vector field which is, in some sense, the infinitessimal displacement of 
the entire surface. 
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5.3.2.2 Experiments 

We have implemented this as a discrete iterative method on the Connection 
Machine. We essentially tie the grid of processors to the (x, y) coordinate plane, 
with a processor at each integer crossing. At each processor we can keep values for 
the image gradients E x and E y as well as current estimates of the deformed surface 
heights p and q for each (x, y). We use an initial surface given by p(x, y) == q(x, y) = 0. 

We need to compute the "surface vector field" components v p and v q at each 
iteration: 

v p = E x - p x R p - p y R q 

v q = E y — q x R p — q y Rq, 
where p x , p y , q x , and q y are derivatives of the current p and q iterates. 

For both image intensities and p and q values, we use a simple first order method 
to compute the derivatives on the interior of the 128 x 128 square of processors with 
integer (x, y) coordinates for the processors: for example, if/ is the function for which 
we want a partial derivative, we take f x (x,y) w (l/2)(/(ar + l,y) - f(x - l,y)). At 
the four boundaries of the square grid, we cannot use this; we use the unbalanced 
estimates given by subtracting nearest neighbors where we have to. For example, 
at the left edge we take f x (Q,y) « /(l,y) - /(0,y); at the top we take f y (x,0) w 
f( x i 1) - /(z,0); at the corners of the grid, both f x and /,, are approximated this 
way. 9 We use this to estimate p x , p y , q x , q y , E x , and E y . 

For the reflectance derivatives, we use an analytic model to give us exact values 
for the derivatives. In our case, we assume a Lambertian reflectance function, 

R(p,q) = a - *+&-** 



Vp 2 + ? 2 + i v //? + /| + /i 



When we discuss the deforming grid algorithm in the next section, we must treat the border differently: 
we effectively interpolate the values found for the derivative from the interior of the image to the borders. 
Using the interpolating method in the fixed-grid algorithm does not change the performance substantially; 
noise immunity seems slightly worse. 
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where (h,h,h) gives the orientation of the light source and a is an albedo factor; 
for our purposes, we take a = 1.0. From this, we have 

a hq 2 + h- (ql 2 - h)p 



R P (p, <l) = 

R q (p,q) = 



y/q + q + q 0> 2 + <z 2 + 1) 3/2 

Q kp 2 + h- (ph - h)q 

y/q + q + q (p 2 + « 2 + 1) 3 / 2 ' 



and we compute R p and R q at each processor using the current values of p and q. 

To approximately deform the surface to match the dynamical system flow, we 
update the p and q values at each processor to p new and q new as follows: 

Pnew = P + hv p 
qnew = q + hv g , 

where h is a kind of integration step size if we consider the procedure as a parallel 
integration of a vector field. In the simple examples with which we have worked, we 
have left h constant over all processors. 

If we run just this algorithm we find that it does not converge. We have found 
it necessary to do a small amount of smoothing of the values of p and q to avoid 
"checkerboard" instabilities: small amounts of noise can explode into large alternat- 
ing sections of values if the p's and g's are not smoothed between iterations. This 
was also found by Ikeuchi and Horn (Ikeuchi and Horn, 1981), and we find good 
results by performing the simple averaging operation 

fnew(x,y) = (f(x + l,y) + f(x - l,y) + f(x, y + 1) + f(x,y- l))/4 

twice in succession on the p and q arrays. 

This internal smoothing operation can be a source of convergence errors, however. 
If we run the algorithm on a sphere centered in the image with radius 100 and light 
source located in direction (0, .5, -1), we can see what happens to the value of q 
at a particular point (64, 104) near the maximum brightness of the image. (Figure 
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5.10 shows constant brightness contours for such an image.) The correct q is —.4365 
and we take h = 1.0 as step size. Over the first 500 iterations, we see the value of 
q decreasing from its initial value of towards the correct value; as we expect, the 
value of v q is negative, but with decreasing absolute value during this approach. At 
around iteration 600, however, the value of q overshoots the correct value: we have 
q = —.4390, and, as we expect, v q turns slightly positive, trying to push q towards 
the correct value. However, the value of q continues to move negatively, converging 
to a value of about —.4505 at iteration 1600 with v q = 1.21 x 10 -4 still pointing 
towards the correct solution. 

If at this stage we change the algorithm to have just one internal averaging 
operation instead of two, we see q begin to move towards the correct solution again, 
but by iteration 800 with this new method, q has stalled at q = —.444 with v q = 
.626 x 10 -4 ; we have made up some of the distance to the true solution, but not all. 

Another way to see the effects of the internal smoothing operation is to begin 
with the correct solution surface and see where the algorithm takes it: by definition, 
the dynamical flow should leave an invariant manifold completely unmoved. If we do 
this with two internal smoothing operations, we see convergence to very nearly the 
same surface as for the (p, q) — (0, 0) initial surface: for example, after 800 iterations, 
q at (64, 104) appears to be converging to —.450 rather than staying at the original 
q = —.4365. If we do this with one internal smoothing operation, the solution starting 
from the invariant manifold is again consistent with that for the (p, q) — (0, 0) initial 
surface: again the value of q converged to at (64, 104) is q = —.444 rather than the 
original q = —.4365. 

From this we can conclude that the internal smoothing operation, although nu- 
merically useful, does slightly distort the final solution away from the true invariant 
manifold. With one internal smoothing operation, the errors in the converged p and 
q are less than 5% almost everywhere, and less than 2% in the image interior. Even 
with two internal smoothing operations, the errors are less than 4% through almost 



128 



Chapter 5 



Critical Points of the Image Dynamical System 




Figure 5.10. Constant brightness contours for sphere. Each stripe is 15 grey-levels wide. 




Mm 




Figure 5.11. Iterates for fixed grid shape from shading algorithm: a. 100 iterations, b. 200 iterations, 
c. 400 iterations; d. 800 iterations, e. p error image (255 * 10 * \p — ptrue\)- For 800 iterations: f. q error 
image (255 * 10 * \q — jtruel). 

all the image — where true p and q values are very near 0, of course, the relative errors 
are higher, and right at the borders of the image the relative errors are also higher 
(but never more than about 15%). We will stick with the double smoothing because 
of the noise immunity it seems to give later on. 

In Figure 5.10, we show the constant brightness contours for a noise-free 128 x 128 
simulated image of a sphere with radius 100, Lambertian reflectance function, and 
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Figure 5.12. Noisy sphere image: Noise maximum is ±.02. 



light source located at (0, .5, -1). In Figure 5.11 we show images from the approxi- 
mate solutions as the iterations proceed (step size h = 1.0): after 100 iterations, we 
use the current p and q values and the reflectance function to generate an image. As 
one can see, the images generated by the p and q estimates do show convergence to 
the original image through almost all of the interior of the image as expected. We 
have also generated images representing the errors in the p and q values: we take 
255 * 10 * \p — ptruel and 255 * 10 * \q - # t rue| and treat these as grey levels for an error 
image. A region that is just barely completely white (a grey level just reaching 255) 
would represent errors of .10 in p or q, corresponding to an error of about 6° in the 
surface normal at ptrue = 0, q tlue = 0, and less as p and q increase. 

If we add noise to the original image, we can degrade the performance of the 
algorithm. With the reflectance function we are using, the maximum brightness of 
the image is 1.0. If we add uncorrelated uniformly distributed noise to the image 
with maximum value .02 (meaning the noise is uniformly distributed between ±.02), 
we usually still get convergence; if we add noise with maximum value .12, we do not 
get convergence to a possible solution. Figure 5.12 shows a noisy image (via constant 
brightness contours) with .02 noise maximum; Figure 5.13a shows the image formed 
from the p and q arrays after 800 iterations with .02 noise, and Figure 5.13b and c 
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Figure 5.13. a. Estimated image from noisy images (±.02) after 800 iterations, b. p error image, c. q 
error image, d. Unconverged image for noise maximum .12 after 800 iterations. 

show the p and q error images. Figure 5.13d shows a failed effort with noise maximum 
.12. 

We can deal with more noise by reducing the step size, h. For example, with 
noise maximum .04, a step size of h = 1.0 converges only about half the time to a 
solution, 10 while a step size of h = .25 almost always converges. This is consistent 
with the view of this procedure as a parallel Euler integration of sorts: decreasing 
the step size makes the algorithm more likely to converge. 

Another way to deal with the uncorrelated noise we have added to the image is 
to pre-filter the image to remove some of the discontinuous influence of the noise. 
For example, with h = 1.0 we get convergent solutions about half the time with noise 
maximum .04; however, if we smooth the image four times with the simple averaging 
operation used to smooth the p and q arrays, we get convergence almost always. 

If we combine filtering and a smaller step size we can deal with even more noise. 
If we set the step size to h = .25, and pre-filter with one averaging operation, the 

That is, if we run the algorithm twenty times it converges on about ten of the trials. 
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brightness image on the left looks very chaotic because the noise is large compared 
to the constant brightness intervals even after pre-filtering. To the eye, the original 
image itself looks fuzzy, but is clearly interpretable as a smoothly curving object. 

Note that in general the algorithm does not converge to p and q values that 
could have given rise to the pre-filtered noisy image, e.g. some complicated faceted 
figure whose noise-free image would be the pre-filtered noisy image we are working 
with. Instead, due to the internal smoothing operations done on the p and q arrays 
within the algorithm, the algorithm appears to converge to a surface that is smooth 
compared to the image data. 

With this much noise, the converged p and q arrays do show considerable differ- 
ences from the original noise-free p and q values used to generate the images. For 
example, although the image in Figure 5.15a appears to be an adequate match for 
the original image, the p's and q's that generate that image vary randomly from the 
original image by a fair bit, as seen in the image of p and q errors in Figure 5.15 b 
and c. Indeed, one would be rightly suspicious that the p and q values converged to 
for noisy images are not an integrable set of normal vectors. We would need some 
kind of constrained method like Horn and Ikeuchi's or Frankot and Chellapa's to 
reconstruct the depth values from the more or less noisy p and q values. Note that 
the amount of noise in the p and q values is directly related to the amount of noise 
in the image. 

In fact, one of the interesting aspects of this algorithm is the lack of dependence 
on an explicit integrability constraint to find a solution surface. If we actually con- 
verge to the theoretical unstable manifold of a smooth image dynamical system, that 
unstable manifold is an integrable collection of normal vectors because it is made up 
of characteristic trajectories. When we add noise to the image and include numerical 
effects, we are effectively adding noise to the dynamical system, and the unstable 
manifold of the new dynamical system is no longer exactly integrable; however, if 
the noise is small, the normals will be very nearly integrable. 
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a- b. c. 

Figure 5.16. Integrability pictures for iterations of the noise-free image of a sphere, a. 200 iterations, b. 
400 iterations, c. 800 iterations. 

We can use an integrability measure to monitor the progress of the algorithm. 
In order for the collection of normal vectors to be normal vectors for a real surface 
in space, we must have p y = q x ; we can use \p y - q x \ as a measure of the integrability 
of the iterated surface and therefore as a measure of how close we might be to the 
correct unstable manifold of the image dynamical system. 

In Figure 5.16 we have tried to give a pictoral representation of this integrability 
measure (IM) in the noise- free case (Figure 5.11). The brightness values run from to 
255; these images are images of I0 6 \p y -q x | with truncation at values greater than 255, 
so that the white area area represents IM values of larger than 2.5 x 10~ 4 . The nearly 
black region (represented here as having very few white dots) have p y - q x values a 
small fraction (< 1%) of most of the p y values in the region. As iterations proceed, 
the regions of good integrability increase, although the edges of the image, where the 
converged constant brightness contours do not match the original (compare Figure 
5.10 with Figure 5.11d), remain with relatively high IM values. 

Adding noise to the image degrades the integrability measure of the iterated 
solutions. In Figure 5.17 we show some examples of integrability measure pictures 
under noisy conditions: we look at the IM pictures for a particular image with .02 
maximum uniformly distributed noise added, with step size of h = 1.0 and four 
pre-filterings of the image: as the iterations proceed, the results become more and 
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Figure 5.17. Integrability measures in the case of noise: noise maximum .02, h= 1.0, 4 smoothings of the 
image, a. 800 iterations, b. IM image after 800 iterations, c. 1600 iterations, d. IM image after 1600 
iterations. 

more integrable. This is consistent with the theoretical view that we are approach- 
ing the unstable manifold of the image dynamical system which should be perfectly 
integrable; in fact, however, progress is essentially halted after 1600 iterations: nu- 
merically, almost no more changes in the solutions or in the IM image occur. Figure 
5.15d shows the integrability picture for noise maximum of .25; clearly, more noise 
leads to worse integrability. 

We can also examine the sensitivity of the algorithm to changes in the reflectance 
function. In Figures 5.18 to 5.20, we explore what happens if we make errors in the 
direction of the light source assumed to have generated the image. We assume differ- 
ent values for h, where (h, l 2 ,k) = (0, .5,-1) is the original light source direction. 
As the error gets worse, the convergence of the algorithm after 800 iterations is 
not as complete; this can also be seen in the integrability pictures in Figure 5.20. 
Nonetheless, the arrays of p and q reached do generate images that correspond to 
the original over large areas with the assumed reflectance function (Figure 5.18); 
however, the surfaces themselves are increasingly different from the correct surface, 
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a. b. 

Figure 5.18. Incorrect reflectance function: images of p and q arrays with assumed reflectance function. 
Noise-free, 800 iterations, h = 1.0, correct l\ = 0.0. a. h = -10. b. Zi = .5. 





Figure 5.19. Incorrect reflectance function: images of p and q arrays with correct reflectance function. 
Noise-free, 800 iterations, h = 1.0, correct h = 0.0. a. h = .10. b. h = .5. Compare with original image, 
Figure 5.10. 

as can be seen from the images generated by the converged surface and the correct 
reflectance function (Figure 5.19). 

The fixed grid method appears to generate good solutions even in the presence of 
noise and seems to degrade gracefully if assumptions about the light source direction 
are incorrect. This is consistent with the theoretical underpinnings of the method: 
noise in the image or a poor choice of reflectance function represent perturbations 
of the original image dynamical system. Since the usual good critical point we deal 
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a. 



b. 



Figure 5.20. Incorrect reflectance function: Integrability measure pictures. Noise-free, 800 iterations, 
h = 1.0, correct h = 0.0. a. h = .10. b. h = .5. 

with is hyperbolic and therefore generic, perturbing the dynamical system a small 
amount moves the critical point a small amount and changes the invariant manifolds 
a small amount: the unstable manifold of the critical point is a stable feature (in 
this sense) of the dynamical system. 

This particular implementation shares difficulties with the simple Euler method 
of integrating a vector field, which sequentially adds a scalar multiple of the vector 
field to a point on the developing trajectory to generate the next point. For example, 
if the step size here is too big, the method may bounce around and not converge 
at all. Smaller step sizes avoid this problem, but too small a step makes for slow 
progress; some kind of adaptive step size setting would be useful. Note that because 
of the smoothing of the p and q arrays as part of each iteration step, taking an 
exceptionally small step size effectively allows more smoothing and less deforming 
along the dynamical flow. There may be analogs to the Runge-Kutta methods that 
make efficient use of several evaluations of the vector field to improve the estimate 
of the next point on the trajectory. 

A feature that may be exploited is the fact that we are really interested in the 
characteristic paths, not the trajectories. This suggests we could use a different 
(positive) h(x,y) to multiply the vector components v p and v g for each point (x,y); 
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Figure 5.21. Stretching of the mesh around a critical point p c . a. Before flowing, b. After flowing. 

this would change how the initial surface is deformed, but would not change the 
surface that is converged to in the limit. The limit is determined by the characteristic 
paths and the directions they are traversed, not by how fast the paths are traversed 
through time; changing the magnitudes of v p and v q by a (positive) scalar at each 
point of the image should not change the theoretical limiting behavior, and may be 
useful in dealing with difficult regions. 

The integrability measure provides a way to judge the progress of the conver- 
gence; changing the step size within regions that are having difficulty may improve 
convergence; and pre-filtering the image with appropriately sized filters can reduce 
the effects of noise. These all contribute to finding the unstable manifold of an image 
dynamical system critical point in a noisy environment. More work needs to be done 
to integrate these different features of the algorithm into a system for finding the 
unstable manifolds of image critical points. 

5.3.3 Deforming Grid Algorithm 

5.3.3.1 Theory 

We have also implemented a more direct version of the theoretical idea. The 
basic idea is to place the mesh of Connection Machine processors on the estimated 
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surface in C(IR 3 , 2), and move each mesh point with the flow of the image dynamical 
system. As can be seen from Figure 5.21, we will have an increasingly less dense 
concentration of processors near the critical point as time progresses. We need to 
continually fill in this region in some manner to maintain accuracy. 

We have chosen to use a very short flow duration so that the distortion of the 
mesh is fairly minor. After this flow, we reset the mesh on the new surface: the 
original mesh is located at integer (x, y) coordinate pairs (the p and q values are 
not restricted). After the short flow, the mesh is located at various slightly altered 
non-integer (x, y) positions. We compute affine approximations to the surface near 
integer (x,y) positions using this distorted mesh, and then fill in the p(x,y) and 
q(x, y) values in the new integer mesh. This new mesh sits on the deformed surface 
and is uniformly distributed around the critical point; we can now flow this mesh 
and repeat. 

To accomplish the flow, we use the image dynamical system vector field: 

v x = R p 

V y = R q 

v p = E x 

Vg = Ey, 

and update the values of x,y, p, and q in each processor as follows: 

•'•new = *£ ~r hVx 
Vnew —y + hv y 
Pnew = P + hv p 
Qnew = q + hv q , 

where again h is a kind of integration step size for the flow. We compute R p , R q 
analytically as in the fixed grid algorithm; E x , and E y we compute in the interior of 
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the image using a slightly different gradient operator than in the fixed grid case: as 
before, on the interior we take 

f x (x,y) = (l/2)(/(s + l,y) - f(x - l,t/)) 

and similarly for f y ; however, the previous method does not yield satisfactory results 
at the borders. Instead, we interpolate the values of the gradients at the border from 
the values in the interior: 

f x (0,y) = 2f x (l,y)-f x (2,y) 

f x (x,0) = 2f x (x,l)-f x (x,2) 
/*(127,y) = 2/,(126,y)-/ x (125,y) 
f x (x, 127) = 2f x (x, 126) - f x (x, 125). 

To find the new mesh values and reset the grid on the deformed surface, we use 
a nine-member neighborhood of each deformed point to estimate a tangent plane 
through that point. To do this, we use a standard least squares approximation 
technique. We normalize the neighborhood to have average x, y, p and q values, 
so that effectively we are working a least squares approximation near the origin. We 
solve for p's and g's separately. Letting / represent either p or q, we set 

* - 

•Aj ■ ~ ~ Jb I ^~ Jb 

y*% =yt-y 

Ji = x i — /? 

where Xi, yi, and /,• are the coordinates of the nine different data points in the 
neighborhood, and x, y, f are the average values of those coordinates over the neigh- 
borhood. We can define 

X = [X 1 , . . . , Xg) 

y = G/iV..,</9*) T 
f = (/iV--,/ 9 *) T 
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to be column vectors of the data coordinates. We seek a fixed column vector of 
coefficients /3 = {fi\,fl2) T such that we minimize 

ll[x,y]/?-f|| 2 , 

where [x, y] is the 9x2 matrix of «,- and yj values. This is the least squares problem 
of finding /?i and /% to approximately fit the linear function f(x,y) — fi\x + fiiV 
to the (normalized) data we have. The usual vector derivative condition on /? for 
finding the minimum of this expression is 

= 2[x,y] T [x,y]/?-2[x,y] T f; 

this can be rearranged to give 

" x T f 



P = 



T T 

x 1 x x J y 

T T 

y x y 1 y 



-l 



y r f 



/? = 



as the least squares estimate of the coefficients. Writing this in the usual summation 
notation, we have 

For each nine-point neighborhood of deformed points, we find fi = (/?i, /%) r this 
way. We then find the closest (x, y) integer coordinates to (x, y); say these are (m, n). 
We set 

f(m,n) = /?i(m - x) + fo(n - y) + /. 

If there is more than one neighborhood with the same (m, n) coordinate, we take the 
average of these values. 

If we do not flow by very much, we will have found values for most of the integer 
coordinates in the new mesh in this way. There may still be some isolated curves of 
integer coordinates without new values: these are filled in using another least squares 
fit over the neighboring filled-in values. As we shall see, this gives us some difficulties 
in the integrability picture. 

Again the edges of the grid pose special problems: in this case, we constrain the 
neighborhoods around a point to be just those points of the theoretical nine-point 
neighborhood that still actually lie on the grid. This gives us at worst four points 
(in the corners) from which to determine the tangent plane to the surface. 
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Figure 5.22. Deforming grid algorithm, no noise: a. 200 iterations, b. 400 iterations, c. 800 iterations, 
d. p error image after 800 iterations, e. g error image after 800 iterations, f. IM picture for 800 iterations. 



5.3.3.2 Experiments 

Because of the intensive least squares computations, this method is considerably 
slower than the fixed-grid method: it requires about 13 minutes on a 16K CM-1 to 
perform 100 iterations; this is in contrast with around 30 seconds for 100 iterations of 
the fixed-grid algorithm. In the noise-free case, the results are improved. In Figure 
5.22 we see the results of the deforming grid algorithm over 400 iterations: the edges 
are much cleaner than with the fixed grid algorithm. We also see the IM picture 
at 400 iterations: most of the interior has very low \p y - q x | values (relative values 
around 1% or less) except for the two moustaches of white in the upper half of the 
image; these correspond to slight "cracks" in the solution (smoothed somewhat by 
the least squares filtering) due to the method of computing the integer coordinates: 
on either side of these cracks, the neighborhoods used to compute the tangent to the 
surface are very different, e.g. on one side of the crack the integer coordinates might 
be towards the upper left of the patch while on the other side of the crack the integer 
coordinates might be towards the lower right. 
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b. 






Figure 5.23. Deforming grid algorithm, noise with .01 max: a. 200 iterations, b. 400 iterations, c. 800 
iterations, d. p error image after 800 iterations, e. q error image after 800 iterations, f. IM picture for 800 
iterations. 

Note that this result is achieved without separate internal smoothing steps: the 
least squares method itself provides some smoothing. If we add internal smoothing 
in the form of two averaging operations on the p and q values as we did in the fixed 
grid case, we get the slightly distorted border edges that we saw in the fixed-grid 
case (e.g. Figure 5.11), and the moustaches in the integrability picture fade but do 
not disappear. 

If we add noise to the image without smoothing the image, we do not get good 
convergence. However, if we add internal smoothing of the p and q arrays at each 
iterations, we do get convergence: Figure 5.23 shows such a result, and the rest of 
the trials were done with the internal smoothing in place. With noise maximum 
values of .01, the deforming grid algorithm converges very well about 80% of the 
time; 11 results look quite acceptable after 600-800 iterations. The errors in the p and 
q values on the interior of the image seem to be running less than 5% for the p values 



That is, after ten attempted runs, about eight will have converged to a solution. 
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Figure 5.24. Deforming grid algorithm, noise with .04 max, h=.25: a. Generated image after 1200 
iterations, b. p error image, c. q error image, d. IM picture. 

and less than 15% for the q values on the interior of the image, and as we move away 
from the center of the image where small p y and q x values magnify errors (and the 
moustaches make their presence felt), \p y — q x \ stays mostly below 10% of the value 
of p y . To properly find a depth surface consistent with the p and q values we will 
have to use a method that can deal with the lack of integrability, but the values are 
not badly non-integrable. Note that the IM picture shares some resemblances with 
that for the fixed grid algorithm: the edges of the image again show poor IM values, 
and low IM values are associated with good matches between the original image data 
and the image generated from the estimated p and q values. If we increase the noise 
maximum to .02, we find convergence only about 20% of the time. 

Smoothing of the image is useful in the deforming grid algorithm as well. With 
noise maximum of .02, smoothing the image changes the convergence rate from about 
20% to just over half. Much more useful is a reduction in step size: all the above 
results were achieved using h = 1.0; if we drop to h = .25, we get dramatic improve- 
ments in noise immunity: with noise maximum of .04, for example, we get good 
convergence of the image all the time, as well as quite good integrability results 80% 
of the time without smoothing the image, and all the time with image smoothing. 
Even for noise maximum .08 the image appears to converge correctly 80% of the time 
without image smoothing; now, however, the IM pictures generally look completely 
white, meaning the p and q arrays converged to after 1200 iterations are much less 
integrable than in the lower noise conditions. In Figure 5.24 we show an example 
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b. 



Figure 5.25. Distorting grid algorithm: incorrect reflectance function: images of p and q arrays with 
assumed reflectance function. Noise-free, 400 iterations, h = 1.0, correct /i = 0.0. a. /i = .10. b. /i = .5. 




Figure 5.26. Deforming grid algorithm: incorrect reflectance function: images of p q arrays with correct 
reflectance function. Noise-free, 400 ite rations, h = 1.0, correct ^ = 0.0. a. li = .10. b. h = .5. 

of the convergence of the algorithm for step size h = .25 after 1200 iterations on an 
image with noise maximum of .04. 

If we begin with the wrong reflectance function, we get very similar behavior to 
the fixed grid algorithm: Figures 5.25 to 5.27 show essentially the same performance 
with the distorting grid algorithm as Figures 5.18 to 5.20 show for the fixed grid 
algorithm. As before, Figure 5.25 is formed from the converging p and q values with 
the assumed, incorrect reflectance function: as the error in the reflectance function 
increases, the convergence becomes slower, but even with h set to .5 instead of the 
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a. b. 

Figure 5.27. Defoiming grid algorithm: incorrect reflectance function: Integrability measure pictures. 
Noise-free, 400 iterations, h = 1.0, correct Ji = 0.0. a. li = .10. b. Ji = .5. 

true 0.0 much of the image is reconstructed. Figure 5.26 is formed using the true 
reflectance function, and shows how different the solution surface is from the correct 
one. Figure 5.27 shows the IM pictures; as the error in the reflectance function 
increases, the integrability measure becomes higher over certain edges of the image; 
however, much of the center of the image stays integrable. This algorithm as well 
seems robust to errors in the reflectance function. 

The deforming grid algorithm is considerably slower in our implementation than 
the fixed grid algorithm discussed in the last section. It does appear to converge to 
somewhat more integrable sets of normal vectors than the fixed grid method except 
for the occurrence of "cracks" in the integrability picture due to a sudden shift in 
the neighborhoods used to calculate tangent planes. It also seems quite robust in 
the face of noise and distortions in the reflectance function. 

The difficult part of doing a direct implementation of the Lambda Lemma intu- 
ition is filling in the parts of the deforming surface that are vacated by processors 
flowing away from the critical point. The deforming grid algorithm we implemented 
flows for a short distance, and then fills in a new rectangular grid consistent with 
the deformed grid; we have used a local least squares linear approximation to do 
the filling in using neighborhoods of the integer grid points. One might try to fit a 
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splined surface of higher order to the data points given by the distorted grid, and 
then get the rectangular grid values from this. 

5.3.4 Conclusions on Implementation 

The successful implementation of the Lambda Lemma intuition provides em- 
pirical support for the theoretical approach taken in the first two sections of this 
chapter. Convex/ concave solution surfaces around a maximum critical point in the 
image due to a critical point maximum of the reflectance function correspond to 
stable/ unstable invariant manifolds of the corresponding critical point in the image 
dynamical system. 

The algorithms derived from the Lambda Lemma intuition appear to be relatively 
stable both with respect to noise and with respect to errors in the reflectance function. 
In addition, there are several potentially useful properties of these algorithms. One 
feature is that they do not require the imposition of integrability at each iteration to 
find a solution. Integrability is a side-effect of a successful convergence: the unstable 
manifold of a critical point must obey both E = R everywhere on the surface (it is 
true at the critical point; all the trajectories of the unstable manifold approach this 
point as t goes to — oo; and H = E — R is constant on each trajectory), and the 
integrability condition, since the integrability condition is true on all characteristic 
trajectories and the invariant manifold is made up of these. One can use integrability 
of the evolving p and q values to monitor the progress of the algorithm, particularly 
in the presence of noise. Another feature is that we can pick step sizes for the 
iterations that are different at each point of the grid because the limiting behavior of 
the convergence is not dependent on the particular time-evolution. This may allow 
careful handling of tricky sections in an image. 

If an image critical point is on the stable manifold, we can reverse time and 
use the same techniques. If an image critical point is a saddle critical point (for 
example, at a maximum of the reflectance function and with saddle surface structure 
at that point), then the invariant manifold is neither a stable or unstable manifold 
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Figure 5.28. 2-d dynamical system restricted to the invariant manifold: in general, most points will be in 
the stable or unstable ma nifold of either a source or sink critical point. 

for the dynamical system, and the methods described here fail. On the other hand, 
this saddle invariant manifold has to come from somewhere. A true solution surface 
will have a two-dimensional restricted image dynamical system defined on its interior. 
This system will have critical points, and will (probably) be a generic two-dimensional 
system: this means that almost all the points on the surface will either be in the stable 
or unstable manifold of a source or sink of the two-dimensional system (Figure 5.28). 
Almost all the points near the saddle critical point in theory should be reachable by 
finding the stable and unstable manifolds of other critical points. 

The methods described in Sections 5.3.2 and 5.3.3 will only work on some (pos- 
sibly quite extended) neighborhood of a single critical point. If two critical points 
are contained in a neighborhood on which these algorithms are applied and the true 
solution surface is the stable manifold for one of the critical points and the unstable 
manifold for the other, it is not clear what solution (if any) will be converged to. The 
Lambda Lemma algorithms will seek simultaneously to find the unstable manifold 
for both critical points; this will most likely lead to non- convergence. 
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This suggests dividing the image into regions each centered on a single critical 
point. We then find the stable and unstable manifolds of each critical point out to 
the (artificial) boundary of its region, and look to see whether or not these surfaces 
can be seamlessly merged together. 

Treating the shape from shading problem as an image dynamical system leads 
to new ideas for shape from shading algorithms. These algorithms are very parallel 
in character, and converge robustly to integrable solution surfaces near the critical 
points without an externally imposed integrability condition. 
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A5.1 Similar signatures for b and a = p t bp. 

We are interested in the signs of the eigenvalues e*i, ai for the 2x2 matrix 
A = P T BP, where B and P are 2x2 matrices, B is symmetric and P is invertible. 
Since A is symmetric, we know it has real eigenvalues ct\,ct2. We are interested in 
the roots of the polynomial 

det(7 - \A) = det(7 - \P T BP). 

Pre-multiplying by P and post-multiplying by P -1 does not change the determinant, 
so we are interested in the roots of the polynomial 

det(7 - \PP T B). 

Since P is invertible, PP T is positive definite: we have 

u T PP T u = {P T u) T (P T u) > 0, 

and equality only occurs if P T u = 0, i.e. if u = 0. 

We can now use an argument similar to the one in Section 5.2.5: we let Q be a 
matrix of orthonormal eigenvectors for PP T , so that QAQ~* = PP T where A is a 
diagonal matrix of the eigenvalues 71 > and 72 > for PP T . We have 

det(PP T B) = a ia2 = detiQAQ^B) = det{KQ- x BQ). 
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B and Q~ 1 BQ, being similar, have the same eigenvalues fl\ and /?2- We have a\a 2 = 
7l72/?l/?2; the signs of «i and oli are the same if and only if the signs of fi\ and fa 
are the same. 

We can also look at 

trace(A) = ai + a 2 = tr&ce(P T BP) = tr&ce(PP T B) 
= tmceiQAQ^B) = trace(A<2 -1 BQ). 

If a and b are the diagonal entries of Q~ 1 BQ^ then we have a.\ + a 2 — 71a + 72 b. 
UB is positive definite (i.e. both /?i and /% posistive), then so is Q~ l BQ and hence 
a and 6 are both positive (e.g., a = (1,0)Q T BQ(1,0) T > 0). This means the sum 
<*i + 0:2 is also positive, and since a\ and oti are of the same sign, they must both be 
positive. Similarly, if /3i and /% are both negative, so are both cx\ and a 2 . Finally, if 
01 and 02 are of opposite sign, so are ai and c*2- 

A5.2 Eigenvalues of a = bc When b and c Are Sym- 
metric 

This is a pictorial look at the following problem (taken in large measure from 
(Norton, 1988)): say B and C are real 2x2 symmetric matrices. When does their 
product, A = 5C, have real eigenvalues? 

If A has one complex eigenvalue, then it has two, and det A > 0; thus if A has 
one real eigenvalue, all its eigenvalues are real. If either B or C is singular, then so 
is A, and hence A has one real eigenvalue, 0. 

Assume both B and C are invertible. We will demonstrate that if either B or C 
is definite, then A has real eigenvalues. Let us assume that B is definite. If C is not 
definite, then det A = det B det C is negative, and A must have real eigenvalues. 

Assume both B and C are definite. We can look at how B behaves by considering 
how B acts on the set of straight lines through the origin in the plane: this is 
also called the projective plane, IP 1 . We can identify lines through the origin with 
points on the circle: essentially, we draw a half circle in the plane and look at the 
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Figure A5.1. a. Lines through the origin, b. S 1 is diffeomorphic to IP 1 . 



intersection of each line through the origin with the circle; we identify the endpoints 
of the half circle (since the lines through these two points are the same), and this 
defines topologically a circle (Fig. A5.1). If p G IR 2 \0, let us define [p] to be the line 
through the origin containing the point p. B can be considered to act on IP 1 since 
BXp = \Bp, so [Bp] = [BXp]p and we can define B\p] = [Bp]. If v is an eigenvector 
for B, then Bv = \v, so B[v] = [v], and B leaves the line containing v unchanged. 
The line through the origin and v is a fixed point of B's action on IP 1 . 

If we take a line [p] G IP 1 we can find out where B k [p] goes as k goes to infinity. 
Assuming B is definite, we can pick an orthogonal basis of eigenvectors {t>i,t> 2 } such 

that in this basis 

"Ai 
A 2 

where Ai and A 2 are the eigenvalues of B with |Ai| > |A 2 |. (If A 2 = A 2 , then B is a 

multiple of the identity, and A has real eigenvalues immediately as a scalar multiple 
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a. 




Figure A5.2. Action of symmetric, definite B: a. Lines picture, b. S 1 ~ IP 1 
picture. B leaves the eigenspaces [v\] and [v 2 ] alone; if |Aj| > | A2 1 , then for p ^ v 2 , 
lim k ^ 00 B k [p\ = [t>i]. 

of C.) If B is definite, then Ai and A2 are the same sign; we can multiply B by — 1 
if necessary to get them positive without affecting the action on lines. In fact, the 
action of B on lines through the origin is unaffected by scaling B by any constant, 
so we can look at the action of 



we have 



B = 



B k = 



1 
A2/A1 



1 

(A 2 /A!)* 



From this we see that if p ^ u 2 , 



lim B k [p] = lim B k [p] = [ Vl ] 

k—hx> k—KX> 

since B will squash the u 2 component of p while leaving the v\ component fixed. 
Since v 2 is an eigenvector for B, lim^oo B h [v 2 ] = [v 2 ]. Looking at how B k acts on 
lines, we see that it moves all of them except [v 2 ] monotonically towards [v\] along 
the half-circle between [v\] and [v 2 ] containing p (Figure A5.2a). In Figure A5.2b, 
we show this on the circle representation of IP 1 by putting arrows along the circular 
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Figure A5.3. Combined action of definite B and C: circle representation, fa] and 
fa] are at opposite sides of the circle, and the dynamics of iterations of B is shown 
again by arrows. The eigenspaces [w\] and [w 2 ] for C are placed on the same circle. 
The combined map A = BC acts as follows: C leaves fai] alone, and B moves [wi\ 
towards fa]. A must have a fix ed point between A[u>i] and A[w 2 ]. 

arcs. In a sense, the point fa] on the circle S 1 ~ IP 1 is a sink for the iterations of 
B on IP 1 , while [^2] acts as a source. 

Now we look at the composition BC. C is symmetric, so it also has orthogonal 
eigenspaces [w{] and [w 2 ]. Let us assume these are different from fa] and fa] (oth- 
erwise B and C are simultaneously diagonalizeable; hence they commute; hence A 
is symmetric and has real eigenvalues). In Figure A5.3 we show how the combined 
action A = BC moves the points [w\] and [w 2 ]: it moves them both towards the 
point on the circle fa]. Since B and C are both definite, they preserve the order 
of points around the circle, and so does A = BC. As a result, A must map the 
arc between [twi] and [w 2 ] containing fa] into the smaller arc between A[w x ] and 
A[w 2 ]. This means A has a fixed point on this interval. To see this, pick a coordi- 
nate chart for the circle containing the interval, and say #i is the coordinate on the 
circle for fai] and 6 2 is the coordinate on the circle for [w 2 ]. If we define the function 
g{9) = A{9) - 9 (where we think of A{6) as the coordinatized version of A), then 
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g{9\) and g(9 2 ) are of opposite sign. By the intermediate value theorem, there must 
be a 6 € [$i, 2 ] such that g(9) = 0, so there must be a [p] between A[wi] and A[w 2 ] 
such that A[p] = [p]; i.e., A has a fixed point. A fixed point for A = BC acting on 
lines means that A has a real eigenspace, and hence real eigenvalues. 

If B and C are both not definite, then this argument does not work. For example, 
if 

B = 
C = 



then 



B k = 



1 


' 





-1 


"0 


r 




1 





? 


"l 








- 1 *. 



1 
-1 



so that B [v] will never converge. In this case 

A = BC 
is a rotation matrix which does have complex eigenvalues. 

A5.3 Eigenvalues of a = bc, c = q t bq, b symmetric. 

In Section 5.2.4 we are interested in the case where B = D 2 R, and C - D 2 E; 
from Section 5.2.2, we know that C = Q T BQ and hence (Appendix A5.1) has the 
same signature as B. We want to know when A has positive eigenvalues. We can 
once again use an argument similar to that in Section 5.2.5. 

From Appendix A5.2, we know that if B (and therefore C) is definite, A has real 
roots. Assume B is definite. We have 

det(A) = ai a 2 = det(BC) = A/?27i72, 

where /?i and /? 2 are the eigenvalues for B and ~n and 72 are the eigenvalues for C. 
Since B and C have the same signature, 0:10:2 is poistive, so ot\ and a 2 have the 
same sign. Picking coordinates to diagonalize B, we can write 

trace(y4) = a\ + a 2 = trace BC = fi\a + j3 2 b, 
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where a and b are the diagonal entries of C in this coordinate system. If B (and 
hence C) is positive definite, then all the values /?i,/?2,a, b are positive, and so a\ 
and «2 must both be positive; if B is negative definite, all the values /?i, /%, a, 6 are 
negative, and a.\ and a^ are again both positive. Thus, if B is definite, A has positive 
real roots. 

If B is indefinite, this result does not hold. The example at the end of Appendix 
A5.2 shows that if B and C are both indefinite, BC can have complex eigenvalues. 
If 



then 



has negative eigenvalues. 
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In Chapter 5 we discussed the contribution of critical points to the determination 
of shape from shading. This chapter discusses the contribution of the image data 
near and at the bounding contour. A motivating example is presented, followed by 
two different analyses, the first a linearization analysis of the characteristic vector 
field at the bounding contour, the second a power series approach. These suggest 
that given a known reflectance function, a patch of bounding contour image data is 
no more useful than a patch of data from the interior of the image not containing a 
critical point: different solution surfaces can be determined by the choice of depth 
values along a curve in the image, in particular the bounding contour. This reinforces 
the theoretical importance of the critical points on the interior of the image as the 
major determinants of solution surfaces. 

6.1 Introduction 

There are two problems with working (either theoretically or practically) with 
the bounding contour image information: the H c : (x,y,z,p,q) i — ► (x,y) rectilinear 
coordinate representation for the surface, the image, and the image projection does 
not have p and q well-defined at the bounding contour; and, although the image 
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Figure 6.1. The turned standard coordinate system for viewing the bounding contour. The image plane 
is now the y — z plane. 

intensity is theoretically well-defined at the bounding contour, the image brightness 
derivatives explode. 

The first problem is surmountable: our approach has been to try to think in 
a coordinate-free way and choose coordinates to answer particular questions. In 
this case, we can turn the coordinates sideways. We use an (x, y,z,p, q) coordinate 
system as before, with tangent spaces coordinatized by p and q, where 



"l 


0" 





1 


.p 


?_ 



spans a tangent space of interest at the point (x,y,z) in space, and the matrix 
is written using vectors ^-, ■$-, ^ as a basis. However, now we take the image 
projection to be 

7T 7 : (x,y,z) — >(y,z). 
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Effectively, we are projecting down the x direction, and points in C(IR , 2) on the 
bounding contour all have coordinate p = since by definition the ^ direction is 
contained in a surface tangent plane at the bounding contour. 

We need to recalculate the coordinate expression for the vector field X defining 
the characteristic dynamical system. In Chapter 4 we defined the image dynamical 
system function as H = E o 7r — R, where it is the projection from the five 
dimensional space C(IR , 2) to the image, and R is the reflectance function defined 
on C(JR ,2). We discussed how finding solution surfaces for the first order partial 
differential image irradiance equation can be viewed as finding a solution surface for 
a differential ideal generated by dH and 0, where is the contact 1-form defined 
in our coordinate system as = dz — pdx — qdy. This in turn gave rise to a closed 
ideal with the same solution surfaces, generated by dH,0, and d0. This ideal has 
one-dimensional solution curves, the characteristics, that can be assembled together 
to make up solution surfaces for the original first order equation. These characteristic 
curves can be defined by a characteristic vector field X. A characteristic vector field 
for an ideal of differential forms is one that preserves the ideal under contraction: 
i.e., if p is a differential form in the ideal, then \xp is in the ideal as well. Thus 

\ x d0 = a + /3dH, 

since these last two generate all the 1 -forms in the image ideal. We can expand both 
sides with our new coordinate system: we have H = E{y,z) — R(x,y,z,p,q) and 
collecting terms we get 

X x dp — Xpdx + Xydq — Xgdy = 

{-ap - $R x )dx + (-aq + f3E y - /3R y )dy + (a + /3E Z - f3R z )dz 
- j3R p dp - PRqdq, 
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where we have allowed R to be space varying, as we will need the results later on. We 
can collect coefficients of the basis 1-forms die, dy, dz, dp, dq to get a = —j3{E z — R z ), 
and hence (using ixO = X z — pX x — qX y — to get the components of X z ): 



X = -/3 



Rp 

Rq 



pRp + qR q 

-R x + p(E z - R z ) 

Ey — R y + q(E z — R z ) 



Even in the space invariant case, R x — R y = R z = 0, this vector field "looks" 
different from the characteristic vector field used to analyze the critical point case in 
Chapter 5; this is because the coordinates have been turned to allow the bounding 
contour to be contained in the coordinate chart. 

The other problem with working at the bounding contour, the fact that deriva- 
tives of the image intensities explode as shown in Section 3.2.1, is more difficult to 
handle. Infinities keep cropping up when dealing with the bounding contour. Our 
analysis in Chapter 5 of the critical point case depended on the dynamical system 
generated by a vector field defined on C(IR 3 ,2). This vector field depends on the 
image derivatives no matter what the coordinate system, as can be seen from the ex- 
pression derived in our new coordinate system in the previous paragraph, and these 
image derivatives blow up in general at the bounding contour. The analogy with the 
behavior of y/x as x goes to zero is very strong, as we shall see. 

6.2 A Motivating Example 

To begin to get an understanding of what might happen at the bounding contour, 
we can work through an example. Consider a surface defined by 

z = ^(ax 2 + by 2 ), 

a paraboloid. (Figure 6.2) At each point (x, y, z) on the surface, we will have 

P = z x = ax 

q = z y = by. 
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\ 



v 





Figure 6.2. Example of bounding contour analysis: a paraboloid. The y — z plane is the projection plane. 

Consider also a very simple reflectance function, 

R(p,q) = Ph + qh- 
The image in our turned coordinate system will be 

E(y, z) = hax(y, z) + l 2 by; 

note that we must express a; as a function of the image coordinates (y, z) in order to 

define the image. Since 

1 . 

x = -\ — 7 =\/2z - by 2 , 
\Ja 

assuming we are interested only in the view of the positive x sheet of the surface 

(remember that we are projecting along the x direction), we have 



l\a 



E{y,z) = ±=y/2z-by* + l 2 by. 
yja 
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The bounding contour in the image will be places in the image (the y-z plane) 
where the surface has p = 0; i.e. places where x(y,z) = {\j y/a^y/^z — by 2 = 0. 
These points lie on the parabola z = \by 2 in the image. 

We can find the components of the characteristic vector field using the expressions 
developed last section and taking j3 = — 1: 



X = 



x 

y 

z 

i> 

Iql 




y/2z - by 2 



As the bounding contour is approached, both p and y/2z — by 2 approach 0; thus, the 
vector field is undefined at the bounding contour. 

The behavior is similar to the behavior of the vector field X = (x/y, 1) T on IR 2 : 
this two dimensional vector field is also undefined at the origin. In this case, we 
can get out of difficulty by multiplying the vector field by the scalar y to get a new 
vector field parallel everywhere (except at the origin) to the old one and given by 
X = (x, y) T . This is well behaved on the plane, but has a critical point at the origin. 

We can multiply the characteristic vector field of our image dynamical system 
by v = \Jlz — by 2 and consider instead the o.d.e 



x~\ I" l\v 

V hv 

z = v(ph + qh) 

P phy/a 

iql lhbv + li^/a(q-by). 

Since p = 0, v = 0, and q — by = at the bounding contour, from the expression 
for our vector field we see that the bounding contour is a curve of critical points. 
The multiplication by v has made the vector field more tractable. If we replace the 
z equation with one for v, and substitute z = (l/2)(v 2 + by 2 ), we will have made a 



162 



Chapter 6 



Image Data at the Bounding Contour 



new related vector field that is actually Lipschitz at the bounding contour. We have 
z = {\/2){v 2 + by 2 ), so 

v(pl\ + qh) = z = vv + byy. 

Using the vector field for y, we can solve for v to get the vector field 



~ x~ 




l\V 


y 

i) 

P 

-9- 


= 


ph + qh - bl 2 y 

phy/a 

J2bv + liy/a(q- by) 



We can solve for p(t) = p exp(/iy / ai). We also note that 

q = l 2 bv + hy/a(q - by) = by + hVa(q - by), 



so 



—(q - by) = hVa(q - by), 



and we have 



q — by — co exp(l\\/at). 
We also have v 2 = 2z — by 2 , so 

vv = z — byy 

= v(ph + qh) - hk/v 
v = ph + l 2 {q - by) 
= hpo exp(li\/at) + /2C0 exp(li\/at) 
= (hPo + hco) exp(liy/at); 



hence 



. hpo + hco, n /-,, 1X 

v = vq -\ — -= — (exp(hVat) - 1). 

«l y/a 



From this and the vector field we can solve for x and y: 
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x(t) = d + voht + h 1 ™*' 20 * ( eM^ft) _ t ) 

y(t) = e + v l 2 t + l 2 — t—= — — , X- - t) . 
Finally, we can solve for 

q(t) = (q- by)(t) + by(t) 

= co expfr vfc) + 6e + bv t + ^ + fr»> ( **P(hft) _ \ 

«iv« V *iv« / 

z(t) = (l/2)(<; 2 + by 2 ). 

We can also find the implicit equations of the characteristic curves from the above 
parameterized equations: we have p(t) = poexj>(Iiy/at), so 



t = T77a zln( ' p / p() ^ 



and therefore 



q = by+ -co 
Po 

, hpo + hco 
v = v + ((p/po) - 1) 

x = d ° + hvo ]^/E ln ^P Q> > + l i hP °p a hC ° ((p/Po) - H(P/P0))) 
y = e + l 2 v -L= ln(p/po) + h hn i l2C \ {plp,) - Hp/Po)). 

We can rearrange and write these equations in a different way. We can define 
functions ?7i,...,774 as real functions on C(ffi. 3 ,2) that have a constant value on a 
trajectory: they are called integral invariants of the flow defined by the vector field. 
Each equation r)i(x,y,v,p,q) = m defines a hypersurface in C(IR 3 ,2); on the interior 
of the image (i.e. away from p = 0) these will be transverse to each other, so that 
the intersection of all four hypersurfaces in the five dimensional C(IR 3 , 2) will be a 
one dimensional curve, the path of the characteristic trajectory. 

We begin by defining 
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Vi - , 

P 

and we see from the above implicit equations that on a characteristic curve, we have 

co 
rji = — = «i. 

Po 
From the implicit equation for v, we have 

. hPo + hco ( p 

V = VQ+ - ■= 1 

'1V« \Po 

h \/a ^1 y/a 

where we have used the fact that q - by = c p/po. We can define 

h\/a 

and we have t/ 2 = «2 along a characteristic curve. If this is compared with the 
expressions for E(y, z) and R(p, q) early in the example, it can be seen that n 2 - 
H/(l ly /a), where H = E(y, z) - R(p, q) is the coordinate representation of the func- 
tion H : C(TR , 2) -> IR that defines the image dynamical system. We know that H 
must be a constant on trajectories of the system; in this example, it has explicitly 
emerged. 

We can simplify the coordinate representation of n 2 by explicitly using the defi- 
nition of 7/i : 

p(h + km) 

rj 2 =v . 

l\\/a 

We can perform the same kind of rearrangement and substitution on the implicit 
equation for x to get 

rn = x-h(v- hllMl^M) hl_ _ j hp + hjq-by) 
V hVa ) l\\fa 1 l\a 

= x - /11727 — 7= - 7— p(v - m), 

k v a hV a 
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with, again, 7/3 = K3 along a characteristic curve. Finally, rather than using the 
equation for y to generate a fourth invariant, we can observe (either from the original 
differential equation or from the implicit solutions themselves) that 

7/4 = l 2 x - hy 

is a constant along the characteristic trajectories; this and the previous implicit 
equation defining 7/3 give the implicit equation for y. 
We write these four invariants here for reference: 

q-by 



m = 



rj 2 = v- 



P 

p(h + hm) 



h\/a 



m = x — 7=( m P + 1) — 7= 

y/a y/a 

?/4 = I2X — l\y. 

Given p^O, and the constants «,-, we can solve sequentially for the other values 
x,y,v,q : 



p(h + I2K1) 

v = k 2 + ._ 

l\y/a 

x = K3 + ^ {lnp+1) + JL (t) 

1 /2 

y = -7-/C4 + —x 
n n 

q = by + pm. 

For p^O, the four invariants rji — «,- describe the one dimensional characteristic 
curves. 

These equations define the characteristic curves for this image dynamical system. 
We are interested in behavior near the bounding contour, places where p = in 
our coordinate system. We can, for example, find out which contours will actually 
approach the bounding contour by examining what happens with the invariants as 
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p approaches 0. From the definition of 771, assuming a finite «i, we must have 
q — by — ► too. From the definition of 772 we also see that v — > k 2 . From the 
definition of 773, in order for K3 to be finite, we must have k 2 = 0; since 7/2 is an 
invariant of our characteristics, it is not a question of 772 approaching zero along a 
characteristic; we must have 772 = «2 identically equal to along the characteristic 
in order for /c 3 to be finite. This forces v — ► and x — > K3 as p — > 0. Finally, we have 
y-> Mk) - (l 2 /h)K 3 

The assumption that the bounding contour of the image in this example is due 
to the projection of finite points on characteristic trajectories has forced 772 = on 
trajectories that satisfy this. Since 772 = H, the image dynamical system function, 
this geometrical assumption has restricted us precisely to the characteristics that are 
consistent with the image: those with H = 0, i.e. with E o Il c = R along them. 

Consider a curve in space parameterized as 

s^->(x°(s),s,^s 2 ,0,bs), 

with x°(s) an arbitrary function. This curve will project to the image of the bounding 
contour of our simple surface, and also has the same normal along the bounding 
contour in space as our original surface. Which characteristics will approach this 
curve? We can parameterize the set of characteristics by s where («i, . . . , k^)(s) for 
a fixed s describes a characteristic that approaches the point (x°(s), s, (l/2)s 2 , 0, bs) 
on the bounding contour. From the above calculation as p -► 0, we see that 

K 3 (s) = X°(s) 

«4(s) = /2X°(s) - hs 

K 2 (s) = 

and apparently arbitrary ki(s) will give characteristic contours that approach the 
given bounding contour in C(]R 3 ,2). 

The degeneracy of the image at the bounding contour is shown again by the 
fact that k\ appears to be arbitrary in determining characteristics that approach 
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a point on the bounding contour. There appears to be a one-parameter family of 
characteristic contours (with k\ as parameter) that approach a given point on the 
contour. 

However, from the equations (f) giving x,y,v, and q in terms of p for a given 
choice of the /c,, we see that the space coordinates x, y, and v (and hence z) are 
determined solely by «2 = 0, K3, and K4; the "ambiguous" K\ does not appear to enter 
in the determination of the space part of any of the characteristic curves approaching 
the bounding contour. This leads one to suspect that in this example k\ must in 
some way be determined by the integrability condition dz = pdx + qdy. 

In the interior of the image, the choice of an initial strip determines the values 
of both H and the contact 1-form 6 at each initial point along the strip; these values 
are constant along each (unique) characteristic emanating from that initial point. In 
our example, at the bounding contour we have a more degenerate situation: we have 
a one-parameter family of characteristics emanating (in the limit) from each point on 
the bounding contour. Using z = (l/2)(u 2 + y 2 ), the integrability constraint along 
the bounding contour is 

z = vv + byy = px' + qy' 
byy' = byy', 

where x', etc. are derivatives with respect to s along the bounding contour, and we 
use the fact that v = p — and q = by along the bounding contour to get the second 
equation. Thus, the integrability condition z' =■ px' + qy' is apparently satisfied for 
the whole 1-parameter family of different trajectories emanating from a single point 
on the bounding contour. This gives us the apparent ambiguity in k\. 

The calculation of the integrabilty equation z' = px' + qy' at the bounding 
contour is not quite justified, however, since at the bounding contour our invariant 
characterization 7/, = k,- of the characteristic trajectories breaks down. We have to 
be more careful with the limiting behavior of the integrability condition. 
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We can examine the integrability condition on the interior of the image near the 
bounding contour along a curve (x, j/, i>, p, q)(s) that runs transverse to the character- 
istics. We have functions Ki(s) denned as the characteristics are crossed. We know 
that in order for these characteristics to approach the bounding contour in the limit 
we must have K2(s) = for all s. We can write the integrability condition using the 
flow invariants 77, = /c; and substituting for v, v', and y' using different equations 
from (f): 

z' = vv' + byy' = px' + qy' 

vv' = px' + (q — by)y' = p(x' + K\y') 

Solving for «i, 

1 

«1 = «3 



(l//l)/C 4 '-(/ 2 //l)«3'' 

Thus, «i is determined by the integrability condition on the interior of the image 
together with derivatives of the curves ks(s) and ac 4 (s). We can take the limit of this 
as our transverse contour approaches the bounding contour: in this case, we have 
x(s) = x°(s), k 3 (s) = x°(s), and /c 4 (s) = l 2 x°(s) - ^s, and so we see that 

Ki(s) = —x° (s). 

This is another sign of the degeneracy at the bounding contour: we need to pass to 
the derivative of the bounding contour data to actually tie down /ci(s), in contrast 
to the image interior where /ci(s) is determined just by the values on the initial strip 
at each s. 

Note that a large family of solution surfaces gives the same image as our original 
paraboloid: we can pick any initial depth curve x°(s) along the bounding contour 
and find a solution surface containing it. The equation for the solution surface can 
be found from the equations (f ) relating x and y to v and the invariant coefficients 
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«,-. Since v — \j2z — by 2 , by squaring and solving we can find an expression for z 
as a function of x and y along the characteristic trajectory fixed by s. From the 
equation relating y and x we can solve for s as a function of y and x (except perhaps 
where x°(s) has zero derivative) and substitute this into the preceding equation to 
get the general equation of the surface. This surface will not fold over itself (except 
where a; does), and so describes a solution surface consistent with the image data 
and reflectance function assumed. 

For example, we can assume x°(s) = as. Then 

k\(s) = —a 
k 2 (s) = 
/c 3 (s) = as 
Kt(s) = (l 2 a - /i)5, 



and 



We can solve for s: 



and we can write 



v _ P(h - ha) 

x = as + (v/yfa) 

h (has — lis) 

v = h x 5 

q = by — pa. 



_ x - (v/y/a) 

s — , 

a 



V = 



hx (ha — l\) (x — v/y/a) 
h h a 



; h . h h 

hy = —f=v -\ — x —v 

\Ja a a\Ja 



V = 



('2 - fc) 
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The depth for this new surface is given by z = |(u 2 + by 2 ). We can look directly 
at the image of the new surface to see if it is the same as the image of the original 
surface: we have z x = vv x and z y = vv y + by, so 

E(y,z) = \\z x + hz y = v(hv x + l 2 v y ) + l 2 by 

+ h J + hby 



(h - l i) V a 
= hvy/a + hby 



= hy/ay/2z - by 2 + l 2 by, 

which is the same as the image of the original paraboloid. . 

How does this motivating example connect with the critical point theory worked 
out in Chapter 5? It turns out that there is no "good" critical point on the interior 
of the image: our reflectance function does not have a critical point, and indeed 
the image does not have a critical point anywhere either (e.g. no "bad" critical 
points caused by parabolic lines as discussed in Section 5.2.1). Thus there are no 
stable/unstable or saddle manifolds to constrain the possible solution surfaces, and 
so it is quite possible to have a large class of globally consistent solutions. 

Several features of this example will be important in the more general work to 
follow. The simple geometry of the original surface considerably helped the analysis: 
we were able easily to calculate the actual image intensities and find the derivatives. 
The extra function v appears in the denominator of the original vector field and is 
useful in analyzing it. v contains a square root responsible for the finite limit of the 
image intensities as the bounding contour is approached while giving infinite image 
derivatives. We multiplied the whole vector field by v and effectively replaced the z 
coordinate with v to generate a Lipschitz vector field which has a curve of critical 
points along the bounding contour; including the depth coordinate x, we actually 
have a plane of critical points for the new vector field. We will try to follow this 
spirit of analysis for the more general case. 



171 



Chapter 6 



Image Data at the Bounding Contour 



*y 





<t> 



> 




projection fibers 
bounding contour 

Figure 6.3. Surface, bounding contour, and twisted coordinate system. 
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6.3 The General Case: Linearization Approach 

From Whitney's theorem (Golubitsky and Guillemin, 1973) about the generic 
nature of singularities of maps between two dimensional smooth surfaces, we know 
that there are two kinds of singularities generically: a fold and a cusp. The bounding 
contour points on the image are curves of fold points except where they appear to 
end in the image interior: these are cusp points (which we will not be examining). 

Here we look at the image near generic fold points of the surface z = f(x,y). 
We will assume that there is a smooth bounding contour visible in the neighborhood 
of the origin, that the surface does not extend in the negative z direction from the 
bounding contour (the surface runs to the "left" of the bounding contour), and that 
we see the image only of the more positive x sheet of the surface. 

We will generate a new local set of coordinates for space and the image in which 
the surface near and at the bounding contour has a particularly simple coordinate 
representation (this is essentially what Whitney proved to be possible): z = \x 2 . This 
new (x, y, z) coordinate system generates a new (p, q) system for the surface normals 
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at each point (in Section 3.1.2 we saw that the (x,y,z,p,q) coordinates for C(IR ,2) 
works for curvilinear (x, y, z) coordinates as well). In the new (x, y, z, p, q) coordinate 
representation, the reflectance function will no longer be (x,y,z)-invariant, and so 
the characteristic vector field will include terms involving space derivatives. 

The required coordinate change can more intuitively be thought of as a transfor- 
mation of the original surface near the origin from a surface z = f(x, y) to a surface 
z — \x 2 . We assume that our original surface contains the origin and has a fold point 
bounding contour element at the origin. We want to maintain the image projection 
as simple orthographic projection along the x coordinate direction. This restricts 
the transformations we can use to ones which map the null fibres {(A, y, z) for all A} 
of the image projection to themselves. Thus, if <f> is such a transformation with 
<j)(x, y, z) = (x, y, z), we have 

7r 7 (^(a;, y, z)) = (y, z) = tt 7 (^(A, y, z)), 

for all A € 1R, and so in the new coordinate system (x,y,z) the image projection 
still is projection down the x direction. 

We can transform the surface in stages: first, we straighten out the bounding 
contour. Say the bounding contour of the original surface is (x°(s),s,z°(s)) param- 
eterized with respect to the y coordinate (we may need a rotation around the x axis 
to make this possible). Consider the map 

V>(x, y, z) = (x - x°(y),y, z - z°(y)). 

For fixed y and z, this takes points (A,y, z) to points (A — x°(y),y,z — z°(y)), so as 
required it maps fibres of the orthogonal image projection to other such fibres. It also 
maps the curve in space (x°(s),s,y°(s)) to the curve (0,5,0), so we have succeeded 
in straightening out the bounding contour. 

We now want to transform the surface near this straightened bounding contour 
into parabolic form. Let us say that we currently have z = f(x,y), with straight 

173 



Chapter 6 Image Data at the Bounding Contour 

bounding contour (0, y, 0). Let us fix the level y = c, and restrict our attention to the 
(x,z) plane at this level. Suppressing the y coordinate, we seek a transformation <j> : 
(x, z) i — y (<t>i(x, z), <j>2{x, z)) that takes the curve (i.e. the slice of the surface at y = 
c) z = f(x) to the curve z = \x 2 , and that takes projection fibres {(A,z)| for all A G 
IR} to other projection fibres. The latter condition means that <f>2(x,z) = <f>2(z), i-e. 
<f>2 is independent of x. We can take (f>2(x,z) = z for example. We now want 

i(*i(*,/(*))) a = &(/(*)) = /(*) 

on the surface, so <j>i(x,f(x)) = y/2f(x). We can take <j>\{x,z) = (j>i(x) = ^2f(x), 
and our full transformation is: 

4>{x,y,z) = (y/2f(x,y),y,z). 

Looking at the positive x sheet of /, is this transformation smooth at the origin? 
We can look at a power series expansion near the origin: 

f(x, y) = a\x + a 2 y + a 3 x 2 + a±xy + a 5 y 2 H 

Since /(0, y) = and p(0, y) = / x (0, y) = for all y in the region of interest, we must 

have f(x,y) = x 2 g(x, y) for some analytic g(x,y): the coefficients of terms involving 

y % or xy % must be 0. We assume that 03 = <7(0,0) 7^ so that / does have second 

order behavior and g = az + h(x,y), where h(x,y) is first order in x and y. We can 

write 

VfKy) = \x\(a 3 + h(x,y))^ 2 

= \x\( V5J + (l/2y/a3)h(x, y)) + • • • , 
using the power series expansion for the square root. Looking only at the positive 
(visible) sheet, x > 0, y/ f(x,y) will have (one-sided) derivatives of all orders if / is 
smooth. 

Looking at the diffeomorphism </> on the visible sheet of the surface, if 03 is not 
the x component of the vector j^cf> is not zero either, and 





Dcj>=\l4> 1 



1 
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will have full rank at the origin; thus, by the inverse function theorem, <j> is a diffeo- 
morphism of the visible sheet at the origin. 

This transformation will take the visible part of a surface z = f(x,y) with 
bounding contour (0,5,0) and map it to a surface z = \x 2 . By composing the 
first bounding contour straightening map if) with <f>, we have a smooth map that 
takes our original surface with bounding contour (x°(s),s,z°(s)) to the surface z = 
%x , mapping bounding contours to bounding contours, and preserving the image 
projection fibres. If we think of this combined map as a coordinate change, how are 
the (p, q) coordinates affected? 

Here is one way to think of the effect: say we start with some other surface 
z = 9( x ,y) with p — g x and q = g y at some point (x,y,z). Say we transform space 
coordinates with some diffeomorphism $(x,y, z) = (x,y, z). In the new coordinates, 
we will have z = g(x, y) defining the transformed surface, assuming that in the new 
coordinates the Jl direction is not contained in the new surface. We must have 

a Pi Pi 

P = M9i 9 = TF§9- We can write the surface points in the new coordinate system as 
$(x,y,g(x,y)) = $ o (/ x g)(x,y), where I(x,y) = (x,y) is the identity map. Thus 

(x, y, g(x, y)) = ($ x o (/ x g)(x, y), $ 2 o (/ x g)(x, y)), 

where $i gives the first two coordinates and $ 2 gives the last coordinate of $. We 
can solve for g: 

g(x,y) = $ 2 o(Ixg)o ($ 2 o (7 x st)) -1 ^). 

We can now use the chain rule to get expressions for p = g & and q = g^. 

Let us do this for each of the two diffeomorphisms defined above, ip and <f>. We 
have xf>(x, y, z) = (x - x°(y),y, z - z°(y)), so 



V>i o (I x g)(x, y) = (x- x°(y), y) 
Dtyx o (I X g)) = 



1 -x 
1 
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hence 



(ZWi o (/ x g)))-i = [J * 

V>2 o (7 X #)(£, y) = y(z, y) - 2°(y) 

7>£ = £>(V>2 o (/ x g)){D{^ o (7 x ^)))- 1 
(9x,9y) = (flfe, 0**° + 9y - *° ) 

(p,$) = (j>, px°' + q-z '). 



In the second case, we assume we have an image due to a surface z = f(x, y) 
with bounding contour (0,y,0). We have <f>(x,y,z) = (y/2f(x,y),y,z), so 



<£i o (7 x g)(x, y) = {y/2f{x,y\ y) 

f* fv 

D(<j> 1 o(Ixg)) = 



1 



(Z?(^! o (7 x s)))- 1 = 



y/2f(x,y) _L 

Jx Jx 



hence 



1 

foo(I x g)(x,y) = g(x,y) 
D(foo(Ixg)) = (g x ,g y ) 

Dg = D(<f> 2 o (7 x g)){D{<t>i o (7 x g)))- 1 

(9i,9y)=(^Vm^y~), -j^ + to) 

= f py/2f(x,y) -pf y 



The composition of these two transformations gives the actual transformation of 
p and q after the two diffeomorphisms ift and <j>. Notice that a bounding contour 
element (i.e., a point with coordinate p = 0) is mapped to a bounding contour 
element, as expected; q = z° is then taken to q = after both transformations, 
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again as expected. Both transformations of p and q are independent of z, so the 
total transformation is as well. Also, when p = 0, the transformation of q to q is 
independent of the x coordinate; this means that along the projection fibre at the 
bounding contour, if p = the transformation of q back to q depends only on y = y. 
The reflectance function in these new coordinate isi2 = i2o$, where $ = (frotp 
is the total transformation of the coordinates (x, y, z,p, q) to (x, y, z, p, q) induced by 
ift and <j>. The fact that R is space invariant and that $ is z invariant acting on p 
and q means that R is z invariant, but does depend on x and y in general. How- 
ever, at a point (\,y,Q,0,q) in C(IR ,2) projecting to a bounding contour element 
(0, y, 0, 0, q), the observation at the end of the last paragraph means that we will 
have R(0, y, 0, 0, q) = R(X, y, 0, 0, q), so that 

R il (0,y,0,0,q) = R iJ (\,y,0,0,q) 

and 

= ^ 4 (0,y,0,0,?) = ^(A,y,0,0,g). 

We will need this a little further on. 

In this new coordinate system (we will omit the " A "), the surface is given as 
z = ijX 2 . This means that p(x,y) = x, q(x,y) = for this surface, and the image of 
the surface will be E(y,z) = R(x,y, ±x 2 ,x, 0) where now x = \/2z is a function of 
the image coordinates. We can write 

E(y, z) = R(y/2z, y, z, V2z~, 0) 
for the image, and so 

tly = Ry 

_ R x + R p ~ 

where we write R to indicate the evaluation is at (y/2z, y, z, y/2z, 0). After substitut- 
ing into the calculation of the characteristic vector field in our turned coordinates at 

177 



Chapter 6 



Image Data at the Bounding Contour 



X = 



the end of Section 6.1, the characteristic vector field for this image and reflectance 

function is 

Rp 

Rq 

pR p + qR g 

Ry - R y + q ( x r— v + R z — Rl 

We now proceed in the same spirit as in the motivating example: we define a new 
coordinate function v — y/2z, notice that vv = i, replace z by ^v 2 , and multiply 
through by v; we also note that R z — R z = by space-invariance as discussed above: 



R p v 

RqV 



pRp + qRg 

-R x v + p(R x + R p ) 

_v(R y - R y ) + q(R x + R p ) 

where the old functions i?,-, i = x,y, etc. are now evaluated at (x,y, ^v 2 ,p,q) and 
similarly for R{. This new vector field is Lipschitz at all potential bounding contour 
points (x, y, 0, 0, 0), and clearly has a zero there as well (note that the vector field is 
really only defined on a half space v > 0.) 

We can attempt to analyze this bounding contour critical point using the lin- 
earization methods used on "good" critical points in the image interior: the linearized 
version of the vector field looks almost like the vector field itself at the bounding con- 
tour point because many terms are of the form kRi, where k is one of the coordinates 
v,p, q, all of which are zero at the bounding contour; hence, the linear part of such a 
term is just kRi. We can write the linearization at the bounding contour as a matrix 
as we did in the interior critical point analysis: 



X' = 













Rp 

Rq 


—R x 



Ry — R v 






Rp 

Rx + Rp 








Rq 


Rx + Rp 



178 



Chapter 6 



Image Data at the Bounding Contour 



A point projecting to the bounding contour is of the form (x, y, 0,0, 0), so R 
indicates evaluation at (0,?/, 0,0,0). From the analysis two pages back we have 
R y = R y and R x = R x = along the projection fibre through the origin. 1 

Thus, our linearized vector field is given by 



X' = 









Rp 














Rq 

















Rp 


Rq 











Rp 

















Rn 



PA 



For convenience we will work at the origin, so that R p = R p . We can see by inspection 
that the characteristic polynomial of this matrix is A 3 (A — R p ) 2 . We can look for 
eigenvectors associated with the eigenvalues: for A = 0, we see that the matrix 
has rank 3, and so the eigenspace is defective, having only two eigenvectors in it, 
(1,0,0,0, 0) T and (0, 1,0,0, 0) T . For A = R p , we get two eigenvectors for X' from 
looking at the null space of 



Rp 




—Rp 


Rp 

Rq 


















—Rp 



Rp 



Rq 




















we get (l,R q /R p ,l,l,0) T and (l,R q /R p ,l,0,Rp/R q ) T . 

Because the matrix is defective, i.e. its eigenvectors do not form a basis, we need 
to expand our earlier analysis of the allowable two dimensional invariant subspaces. 



We can also argue directly that R x = Rx = along the projection fibre through the origin: we know 
that in the original space invariant problem from which this one comes, there are multiple solutions: we 
just move the correct solution surface back and forth in depth. After transforming all of these solution 
surfaces (or at least patches of them near the origin), these will again be a stack of solution patches 
along the projection direction. Consider the question of finding a possible bounding contour curve in 
space: we seek a curve (x°(s),s,0,0, 0) in C(JR 3 , 2) that is consistent with the image. We must have 
E(s, 0) = R(x (s), s, 0, 0, 0) along the bounding contour. If R x at any point on the origin's projection fibre 
is not zero, then by the implicit function theorem we can solve for i as a function of s in the equation 
E(s, 0) = R(x, s, 0, 0, 0) at least in some neighborhood of the point. This would imply a unique curve x°(s) 
as the bounding contour solution; in fact, we have a dense stack of consistent solution surface patches at 
different depths along the origin's projection fibre; thus there cannot be a unique local bounding contour 
solution, and so we must have Rx = along the origin's (or any bounding contour point's) projection fibre. 
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In the case of a critical point on the interior of the image discussed in Chapter 
5, we were able to focus on a diagonalizable matrix, for which all two-dimensional 
invariant subspaces are spanned by pairs of eigenvectors. Pairs of eigenvectors still 
span invariant spaces in the defective case, but there may be others. To begin this 
analysis, we put the matrix into canonical form by using the basis 

/! = (0,0,1,0,0) T 

f2 = X'f 1 =(R p ,R q ,0,0,0) T 

/ 3 = (0,1,0,0,0) T 

U = (l y R q /R p ,l,l,0) T 

fs = (l,Rg/R p , 1,0, R p /Rq) . 
In this basis, the matrix becomes 



X' = 



X has the characteristic polynomial x 3 (x — R p ) 2 . As this matrix is not diago- 
nalizable, the minimal polynomial (the polynomial of least degree that annhilates 
X') cannot be just the product of the linear factors; by calculation it can be seen 
that the minimal polynomial is x 2 (x — R p ). If W is an invariant subspace of a linear 
transformation T on a finite vector space, then T\w, the restriction of T to W, has 
a minimal polynomial that must divide the minimal polynomial of T on the whole 
vector space (concepts from (Hoffman and Kunzie, 1971)). If W has dimension two, 
then the degree of the minimal polynomial of T\w must be two or less. We can 
classify the invariant two-dimensional subspaces of T by their minimal polynomial, 
and hence by the factors of degree two or less of the minimal polynomial of T = X' . 

Consider the minimal polynomial x. If a two dimensional invariant subspace is to 
have this as the minimal polynomial for T\w, we must have T(W) = 0. Looking at 
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R p 














1 


—Rp 

















—Rp 







































the canonical matrix for T = X', there is only a two dimensional null space for T, and 
so the only invariant two-dimensional subspace with x as the minimal polynomial is 
Span(/2,/3). This is the zero eigenvalue eigenspace, W°. 
Consider the minimal polynomial (x — R p ). We have 



T-R p I = 



and again this matrix only has a null space of dimension two, so the only invariant 
two-dimensional subspace with x - R p as minimal polynomial is Span(/ 4 , / 5 ). This 
is the non-zero eigenvalue eigenspace, W R . 

Consider now the minimal polynomial x 2 . We seek invariant subspaces such that 
T(W) ^ 0, but T 2 (W) = 0. We have 



T = 



In order for the above conditions to be true, there must be some vector v € W such 
that Tv ^ 0, T 2 v = 0. If v = (vi,v 2 , V3,v±, v 5 ) T , then the first condition implies 
at least one of ui,i>4,i7 5 must not be 0; the second condition implies in general 
(assuming we are at a generic point for R so that R p and R q are non-zero) that 
v 4 = v 5 = 0. Thus, we must have f\ € W. Since W is invariant, we must also 
have T/i = / 2 € W as well, and since these are independent, the only T invariant 
two dimensional subspace with minimal polynomial x 2 is Span(/i,/ 2 ). As we shall 
see later, this is not an allowable invariant subspace: it violates an integrability 
constraint. 

For the last potential minimal polynomial x(x — R p ), any pair of eigenvectors, 
one from the zero eigenvalue eigenspace W° and one from the non-zero eigenvalue 
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eigenspace W R , spans a subspace with this as its minimal polynomial. Say u\ € 
W°,u 2 G W R . Then (T - R p ) Ul = -R pUl ^ 0, Tu 2 = R p u 2 ^ 0, but T{T-R p ) Ui = 
0, so W = Span(ui, u 2 ) has minimal polynomial x(x — R p ). Is there any other vector 
v = v\f\ + u\ + u 2 , u\ G W°, u 2 € W R , that could possibly be in an invariant 
subspace with x(x — R p ) as the minimal polynomial for T\wt 

(T - R p )(v) = v 1 f 2 + R p u 2 - RpVifi - R p u\ - R p u 2 

- v\{f 2 - R p fi) - R p ui 
T(T - R p )v = - Vl R p f 2 = 0, 

so we must have v\ = 0. Thus, the only invariant subspaces with x{x—R p ) as minimal 
polynomial are those spanned by a pair of eigenvectors, one from W° and one from 
W R . 

This exhausts the possibilities for two-dimensional invariant subspaces of this 
linear transformation. The subspace W R can be rejected as deriving from a surface in 
space because its space projection is one-dimensional. The zero eigenvalue eigenspace 
W is more of a problem: it expresses the potential complication due to higher order 
solution surfaces, i.e., surfaces that are third and higher order flat along the bounding 
contour. In this analysis we concentrate on truly second order solution surfaces if for 
no other reason than that these are generically the ones we expect to have images 
of; we will not consider W° a possible invariant tangent space for a solution surface. 
The unusual invariant subspace spanned by f\ and f 2 will be disallowed when we 
introduce an integrability 2-tensor constraint which needs to hold at the bounding 
contour. As we shall see, this will leave only subspaces formed from a choice of 
eigenvector from W° and a choice of eigenvector from W R as candidates for solution 
surfaces' tangent spaces. 

As suggested above, there is one more constraint that needs to be taken into 
account. We recall that the linearization is in (x,y,v,p,q) coordinates: we have 
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replaced z by v. Consider the integrability constraint as expressed by the contact 
1-form written with v instead of z: 

= 0(h) = (dz -pdx- qdy)(h) 
= (vdv — pdx — qdy)(h), 

where h is any tangent vector to an allowable solution surface: a two-dimensional 
surface in C(1R , 2) whose orientation part as specified by the p and q coordinates 
is consistent with the space part. Written with the v coordinate, we see that this 
condition is identically satisfied at the bounding contour: any surface (now in the 
x -,Vi v -,P-,q topology) containing the bounding contour will satisfy this condition at 
the bounding contour, since v = p = q = there. This is another reflection of the 
degeneracy of the bounding contour. 

As in the motivating example, another way to view this is to consider a path 
(x,y,v,p,q)(s) in a possible solution surface. The integrability condition applied to 
the tangent of this curve gives 

vv' = px' + qy' , 

and again this provides no constraint on the curve at the bounding contour. However, 
we can consider the limit of this constraint as the bounding contour is approached 
by looking at the derivative of the constraint with respect to s: 

{v'f + vv" = p'x' + px" + q'y' + qy"; 
at the bounding contour where v = p = q = 0, this becomes 

{v'f = p'x' + q'y'. 

In a sense, the degeneracy of the integrability condition at the bounding contour 
allows the next highest order of the integrability condition to appear as the major 
constraint. This integrability condition must be true for all tangent vectors lying in 
a potential integrable solution surface. 
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We can write this condition as a bilinear form or 2-tensor on the tangent space 
at the bounding contour: 

(dp ® dx + dq ® dy - dv <S> di>)(h, h) = 



or as 



h'Ah = 0, 

where A can be thought of as the symmetric matrix representation of this 2-tensor 
in the coordinate system (x,y, v,p, q), 



A = 



10 
1 
0-200 
10 
L0 1 



Which of our invariant subspaces will satisfy this 2-tensor integrability condition? 
/l = (0,0,1,0,0) fails the condition, and so the invariant subspace Span(/i,/2) 
does not correspond to a real surface. Applying the 2-tensor to the basis vectors 
h = (R P ,R g ,0,0,0) T , h = (0,1,0,0,0) T , we have 

/ 2 T A/ 2 = 0-R p + 0-R q -0-0 = 

/ 3 r A/ 3 =0-0 + 0-l-0-0 = 0, 

so both these basis vectors satisfy the 2-tensor integrability constraint. Applying 
the 2-tensor to the basis vectors f± = (1, ft, 1, 1, 0) T , /s = (1, /?, 1, 0, l//?) r where we 
define /? = R p /R g , we have 

/ 4 T A/ 4 = 1- 1 +0-/?- 1 • 1 = 

/ 5 r A/ 5 = 0-l+(l//?)-/?-l-l = 0, 

so both of these basis vectors satisfy the 2-tensor integrability condition as well. 

This leads us to ask what happens to the 2-tensor integrability constraint acting 
on linear combinations of vectors. Assume hi and h 2 are two vectors satisfying 
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the 2-tensor constraint = h^Ah,-. Any scalar multiple Ahi will also satisfy the 
constraint. Also, 

(hi + h 2 ) T A(hi + h 2 ) = h^Ahj + h^Ah 2 + 2hf Ah 2 

= 2hf Ah 2 , 

so that a linear combination of hi and h 2 satisfies the 2-tensor integrability constraint 
if and only if hi and h 2 are A-orthogonal, i.e. hf Ah 2 = 0. If hi and h 2 span a two- 
dimensional space all of whose vectors satisfy the 2-tensor integrability constraint, 
we must have hi and h 2 A-orthogonal to each other. Correspondingly, given hi, we 
can only choose a possible h 2 from the set of A-perpendicular vectors to it, h\ '. 
Since A has full rank, h\ ' will have dimension four. 

Looking at the zero eigenvalue eigenspace W°, we have 

/ 2 r A/ 3 = • + 1R P ■ + • 1 + R q • - 2(0 • 0) = 0, 

so the subspace spanned by / 2 and fa satisfies the integrability constraint. For the 
non-zero eigenvalue eigenspace W R , 

/ 4 T A/ 5 = l-l + l-0 + 0-/? + (1//3) • - 2(1 • 1) = 0, 

so the subspace spanned by e 4 and e$ also satisfies the integrability constraint. Thus, 
both eigenspaces W° and W R satisfy the integrability constraint. 

Consider now the mixed invariant subspace case, with one basis vector, hi chosen 
from W° and another to be chosen from W R to make the resulting subspace obey 
the integrability constraint. Given hi £ W°, we seek an eigenvector from the inter- 
section of the non-zero eigenvalue eigenspace W R with h\ " L \ Since C(IR 3 ,2) only 
has dimension five, the minimum dimension of the intersection of a two dimensional 
subspace and a four dimensional subspace is one. The intersection has dimension 
two if and only if the entire non-zero eigenvalue eigenspace W R is A perpendicular 
to hi. This does not happen in general. 
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To see this, begin with a vector hi from the zero eigenvalue eigenspace W°, 
hi = (a, b, 0, 0, 0) . We seek an eigenvector h 2 of the non-zero eigenvalue eigenspace 
W R . From the coordinates of the basis vectors (1,0,1,1,0) and (l,/3, 1,0, 1//3) for 
W R , h 2 has the form h 2 = (c + d, fi(c + d), c+d, c, d/fi) T where c and d are constants, 
and fi = Rp/Rg as before. We want h 2 and hi to violate the integrability constraint; 
i.e., we want 

^ hf Ah 2 

± • (c + d) + (c + d) ■ a + • /?(c + d) + (d//3) • b - 2(0 • (c + d)) 
-db ± /3(c + d)a. 

We can clearly choose c and d not both zero so that this latter condition is satisfied 
when not both a and 6 are zero. Thus, the non-zero eigenvalue eigenspace W R is not 
completely contained in h^ ' for any non-zero hi , and the intersection W R n h^ A± ^ 
has dimension one: for each choice of hi G W° there will be only a one-dimensional 
subspace of possible h 2 G W R that makes the subspace spanned by hi and h 2 satisfy 
the integrability constraint. 

Note that the space parts of tangent spaces spanned by one vector from W° and 
one from W R are the same; this reflects the fact that the tangent plane in IR 3 is 
fixed by the surface normal which is known at the bounding contour. The difference 
between different eigenvector pairs is in the curvature information contained in the 
orientation part of the eigenvectors. 

For surfaces with non-zero second order behavior (i.e. surfaces that have at most 
one tangent vector from W°), this suggests there is always a one parameter family of 
tangent spaces consistent with the image data at the bounding contour point. This 
is consistent with the results of the motivating example, which indicated solution 
surfaces were determined only up to a choice of depth values along the bounding 
contour. This is similar to the case of an image patch in the interior without critical 
points: by picking different depth curves x°(s) along a fixed curve in the image 
(y(s),z(s)), we can generate a one-parameter family of tangent planes consistent 
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with the image at that point: this family is determined by the constant brightness 
contour of the reflectance map with brightness value equal to that at the image 
point. This is not similar to the case with image patches containing a critical point: 
as discussed in Chapter 5, the very good critical points almost always determine a 
small number of possible solution surfaces consistent with the patch, and therefore 
there are only a finite number of tangent planes consistent with a given point in the 
image patch. 

How does the original surface, z = \x 2 , fit into all of this? This surface is 
parameterized in (x,y,v,p,q) space as 



j : (r,s) 



/r\ 

s 
r 
r 

w 



using v = y/2z. The invariant tangent space in C(]R 3 ,2) corresponding to this 
surface is the subspace spanned by the vectors dj/dr| r=0 ,5=o = (1,0, 1,1,0) T and 
di/ds| r= o,s=o = (0,1,0, 0,0) . This subspace is the same as the invariant subspace 
spanned by the eigenvectors (1,R P /R q , 1, 1, 0) and (0, 1, 0, 0, 0) of X' , so our original 
surface has a tangent plane spanned by eigenvectors of X' , one from W° and one 
from W R , consistent with our analysis. 

The presence of the zero-eigenvalue eigenspaces for X' is important in allowing 
our hypothesis to be possible. As we discussed last chapter, the Grobman-Hartman 
theorem says that a critical point which has no eigenvalues with zero real part has 
the non- linearized local picture diffeomorphic to the linearized picture, as in the very 
good image critical point case: an invariant manifold in the linearized picture corre- 
sponds uniquely to a local invariant manifold of the non-linearized picture. We are 
positing more flexibility than that: we are saying that we may have the lineariza- 
tion x° (0) the same for two different depth curves x°(s) through the same point 
(x (0), 0, 0) and still get different solution surfaces in the general case, matching the 
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motivating example, but contradicting the result of the Grobman-Hartman theo- 
rem. The Grobman-Hartman theorem does not apply here because of the presence 
of zero-eigenvalue eigens paces. 

Note that we have not proved this assertion about solution surfaces near the 
bounding contour; we are saying that the assertion is consistent with the degeneracy 
of the critical point at the bounding contour. Unfortunately, invariant manifolds of 
a linearized system with zero eigenvalues may disappear once the second-order terms 
are put back in. One can construct simple examples (e.g. the system x = x 2 , y = y) 
where the linearized picture does not completely capture the actual behavior or 
the uniqueness of invariant manifolds through a point. There are some results on 
existence of invariant surfaces in the presence of non-hyperbolic critical points, but 
they do not seem useful to our problem. 

Nonetheless, the motivating example suggests that the above interpretation of the 
linearized picture may be correct in general. We will pursue the question in a different 
way using the technique of power series analysis: here too one is hampered by the 
restriction that the power series analysis only tells about analytic solution surfaces, 
and unfortunately when zero- eigenvalue eigenspaces of a critical point are present, 
non-analytic behavior is not uncommon. In addition, a typical difficulty is proving 
convergence of a derived power series: we have not done this in our case. Nonetheless, 
the power series analysis in the next section shows the one-parameter ambiguity 
extending into the third order behavior of analytic solution surfaces, consistent with 
the hypothesis that generic solution surfaces at the bounding contour are determined 
only up to a smooth choice of depths along the bounding contour. 

6.4 General Case: Power Series Analysis 

We begin again with our transformed problem: a surface z = \x 2 with a space- 
varying reflectance function. We seek other surfaces z = f(x,y) that could have 
given rise to the image. We could begin from a general surface and a space-invariant 
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reflectance function, but the additional complexity of the surface structure makes 
the calculations even longer than they are here. 

We can begin by analyzing the restrictions placed on the first few orders of 
behavior by the existence of the image bounding contour (0,5,0); we assume (as 
before) that this contour in the image is due to a curve in space with C(IR 3 ,2) 
coordinates (x°(s),s, 0,0, 0) that is contained in the surface. We will assume we are 
looking at behavior in a neighborhood of the origin. 

We begin by expanding both / and x°: 

f(x,y) = b x x + biy + a x x 2 + 2a 2 xy + a 3 y 2 

+ a^x 3 + 3a 5 x 2 y + 3a 6 xy 2 + a 7 y z -\ 

x°(y) = aiy + a 2 y 2 + a 3 j/ 3 + • • • . 
Assuming both f(x,y) and x°(x,y) are analytic, then we have 

f(x°(y),y) = 

f x (AvU) = o 

f y (x°(y),y) = 

since (x°(y),y,Q) lies on the surface and is the bounding contour for the surface. By 
thinking of / as a function of x only with j/asa parameter, these conditions mean 
that we can write 

f(x,y) = (x-x°(y)) 2 Q(x,y), 

where Q is also analytic, since for fixed y the power series for / in x around x = x°(y) 
has a zero and a zero in the derivative at x = x°(y). If we write 

Q(x, y) = q Q + q lX + q 2 y + qzx 2 + 2q±xy + q 5 y 2 + • • • , 

then we can take the product (x — x°(y)) 2 Q(x,y) term by term as a power series: 
we get 

f(x,y) = q x 2 - 2a 1 q xy + a\q Q y 2 + q x x z + (q 2 - 2a 1 q 1 )x 2 y 

+ («i9i - 2 <*290 - 2a 1 q 2 )xy 2 + (2<*ia;29o + <*i?2)«/ 3 + • • • 
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Taking the derivatives with respect to x and y we get 

fx(x,y) = 2q x - 2aiq y + Zq\x 2 + 2(q 2 - 2a\qi)xy 

+ (<*i<li ~ 2 «29o - 2aiq 2 )y 2 + ■■■ 
fy( x ,y) = -2ai9oa; + 2ajq y + (q 2 - 2a x qi)x 2 

+ 2 ( a i9i - 2cv2?o - 2ot\q 2 )xy + 3(2ai 0:240 + oe 2 q 2 )y 2 -\ . 

The decomposition f(x,y) = (x - x°(y) 2 Q(x,y) takes care of the geometric 
information available to us: that (x°(y),y, 0) is the bounding contour for the surface 
defined parametrically as (x, y, f(x, y)). Now let us add in the image data. We know 
the image is due to the surface z = %x 2 , so, as before, we have 

p(y,z) = x(y,z) = \f2~z 
q(y,z) = 
E(y, z) = R(V2l, y, 2, V2l, 0) 

Replacing z = f(x, y) to find the new surface, the image irradiance equation can be 
written as 

E(y,f(x,y)) = R(x,y,f(x,y),f x (x,y),f y (x,y)) 
R(V2f(x,y), y, f(x, y), y/2f(x,y), 0) = R(x, y, f(x, y), f x (x, y), f y (x, y)). (*) 

The next step is to expand this equation in powers of x and y. 2 

We need an expansion for y/2f(x,y). After gathering terms, we have 



V2/(*,y) = \x - x°(y)\y/2Q(x,y) 

= y/2q~QX - y/2q~QOciy + (l/y/2q^) qi x 2 + (l/\/2go)(«2 - q\ai)xy 
- y/2<to(a 2 + (a iq2 /2q ))y 2 + ■ ■ ■ , 



If we had worked with a general initial surface z = g(x, y) instead of z = \x 2 , we would have had a fair 
bit more trouble writing this equation — effectively, we would be simultaneously solving for x 9 (y,z), where 
for a given y and z denoting a point in the image, x g (y, z) is the value of x such that z = g(x g , y) = f(x, y). 
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using the Taylor series expansion \/a + h = \/a(l + (l/2)(h/a) -\ . We will assume 

that 9o 7^ so that f(x, y) does have second order behavior at the bounding contour. 
As we are interested in the positive x sheet, we have also taken (x — x°(y)) > 0, 
as this defines (locally) the region of the surface f(x,y) which is more x positive. 



As expected, y/f{x,y) on this sheet is first order in x and y; this suggests some 
care in expanding the image irradiance equation (*) in powers of x and y to get the 
coefficients correctly matched. 

For convenience, we will assume that R has an expansion of the form 3 

R(x, y, z, p, q) = R + R x x + Ryy + ■■■ + R q q + R xx x 2 + R xy xy + \- R qq q 2 + ■■■. 

After substituting into the image irradiance equation the series for y/2f(x,y), f(x, y), 
fx( x ->y)-> an d fy{x,y), we can examine the terms that are linear in x and y: 



(R p + R x )(y/2q x - y/2qoaiy) + R y y 

= R x x + R y y + R p (2q x - 2aiq y) + R q (-2a 1 q x + 2a1q y) 
Equating coefficients of x and y we have 

(Rx + Rp)^/2q~ Q = R X + 2(R p q - R q a iqo ) 

-(R x + Rp)y/2%oti + R y = R y - 2(R p aiq - R q a\qo). 

At first glance, this appears to yield two equations in the two unknowns qo and a\; 
however, after canceling the .ftj, terms in the second equation, noting that R x = 0, 
and multiplying the first by -a\ the second equation is seen to be identical to the 
first. We can use the first equation to determine a\ as a function of <fo, but we see 
the one-parameter second order ambiguity in / arising early in the analysis. We have 

R P ( 1 



Rq \ \/2<7o 



For bookkeeping conveyance the coefficients R xx , R xy , ..., R qq are here just coefficients for the terms 
of the expansion. R xx = (l/2)d 2 /dx 2 R, etc. relates these coefficients to actual derivatives of R. 
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[Note that we are assuming in general that R p and Rq are non-zero.] 

The second order terms of the image irradiance equation are more of a book- 
keeping challenge. Rather than write out all the second order terms at once, we will 
focus on the coefficients for each term x 2 ,xy,y 2 individually. 
Coefficients for x 2 : 

o 

(R xx + R xp + R pp )2qo + R z qo -\ — ~==qi 

V2qo 

= R xx + 2R xp qo — 2R xg aiqo + R z qo + 3R p qi + ^R pp ql — A:R m a\ql 
+ Rq{q2 - 2a iqi ) + W qq ct\ql 
Coefficients for xy: 

/—— R 
- 4:(R XX + R xp + Rpp)qQOc\ + (R xy + R yp )y/2q^+ / J— (q2 - qioci) - 2R z a\qo 

V 2 ?o 

= Rxy — 2Rxpai% + IRxqOiiqo + 2R yp q - 2RyqOiiq 

- 2R z a 1 q + 2R p (q 2 - 2c*igi) - %R pp a x ql + 8R pq ajq% 
+ 2R q (a\qi - 2a 2 qo - 2a iq2 ) - %R qq a\ql 

Coefficients for y 2 : 



2(R XX + R xp + Rp P )qoa 1 — (R xy + R yp )y/2qooti + R yy 

+ R z a 2 q - R p ^/2q~ Q (a 2 + ^) 

= Ryy — 2R yp aiqo + 2R yq a 1 qo + R z a 2 qo 

+ Rp{oc\q\ - 2a 2 qo - 2a x q 2 ) + ±R pp ci\ql 

- ARpqalql + 3R q (2aia 2 q + oc 2 q 2 ) + 4R qq afql 
These are also a system of linear equations in q\, q 2 , and a 2 . These last three 
equations together define a linear system of three equations in three unknowns. The 
equations for this system are 
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( R P \ 

\~JW ~ 3Rp + 2aiRq ) qi ~ R i q2 = Cl 

(~ vS + 4Rpai " 2jR « a ?! + ("4= - 2#p + 4i2,ai J ft + 4R g q a 2 = c 2 

- flpafo + (--^ + 2i? p ai - SRgcA q 2 

+ y-R p \/2qo + 2R p q - 6R g aiq ) a 2 = c 3 

We can show that these three equations form a linear system of rank two, so 
that there is another one-parameter ambiguity arising at the second order. First, we 
recall that along the bounding contour, R x = 0. Second, we make use of the first 
order solution for ot\ : 

R P (. 1 



Rq V \/2<7o/ 

We can substitute this in to the equations and simplify. The system of three equations 
becomes 



R p )qi- R g q 2 = ci (*) 



2 

R 



R p 



L 9 



" J*) ( 2 + v8») 51 + R ' ( 2 " vfc) S2 + 4i? "° Q2 " C2 ( "» 



_K(, i \ ! a 2 , / i \ / 2 



^ (' - vfc) ° 2 = <* 



If 9o = 2> tnen 1 - (l/\/2go) = 0, and the last equation is identically zero. In 
this case the three linear homogenous equations (i.e., setting the c, = 0) form a 
dependent, rank 2 system. If 90 ^ j, we can factor out 1 - (l/yflq^) from the last 
equation to get 

R Z P / 1 \ R\( 2 \ 
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We can now show that a linear combination of the homogenous versions of equations 
(**) and (***) will give us a linear multiple of the homogenous version of equation (*), 
showing that the system is dependent in general. We take (R q /R p ) times equation 
(* * *) and add it to equation (**) to get 

Rl( 1 \ / l \ / i 



which is just -{R p /R q )(l - (1/ y/Zqo)) times the homogenous version of equation (*). 
In this case as well the system has rank 2. 

In general the fact that the homogenous system has rank two means either that 
the system of linear equations including the original c,- has no solution, or it has a 
one-dimensional family of solutions. In the Appendix to this chapter we show that 
for this singular system written as Ax = b, we do have 6 in the row space of A; 
thus there is a one-dimensional family of solutions for x, i.e. for q\, q 2 , and a 2 . As 
a result, the third derivatives of an analytic solution surface with non-zero second 
order behavior also possesses a one-parameter ambiguity, again consistent with the 
suggestion that every depth curve x°(s) generates a different solution curve. 

6.5 Bounding Contour Conclusions 

This analysis suggests (up to third order) that an analytic potential solution 
surface / with non-zero second order behavior is determined by the image data 
and the bounding contour in the image up to a choice of depth function x°(s) along 
the bounding contour. This corresponds with our linearized analysis and carries that 
analysis one order further; however, the problems discussed earlier about convergence 
of power series derived this way, and the potential existence of non-analytic solutions 
(or solutions with only third order behavior and above) prevents us from concluding 
with certainty that the general mathematical problem admits solutions unique up to 
a choice of x°(s). It is not clear how to extend the power series analysis beyond the 
third order without a prohibitive notational burden. 
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This analysis of the bounding contour again emphasizes the theoretical impor- 
tance of the critical points on the interior of the image as the determinants of actual 
surface shape. Even if solution surfaces are determined up to a choice of x°(s), it 
still suggests that the bounding contour shading information in isolation is of limited 
value in determining shape. We would expect from this analysis that if subjects are 
shown bounding contours (not closing on themselves) with image data near them, 
subjects should be relatively poor at determining the true solution surface even if 
they know the position of the light source. 

If true, our hypothesis says the ambiguity at the bounding contour is about as 
bad as the ambiguity in the interior of the image on a patch not including a critical 
point: there as well we can determine surface shape up to the choice of a depth curve 
by picking a curve on the image, choosing a depth curve along it, and generating a 
solution surface (locally) using the characteristics. 

However, this does not mean the bounding contour image data is useless: the 
bounding contour may be very important for determining whether a particular re- 
flectance function could have given rise to the image. The image values at the 
bounding contour (in so far as they can be reliably determined) are the result of 
known surface orientations; they provide a curve of data from the reflectance func- 
tion which may quite tightly constrain choices within the class of possible reflectance 
functions. There may also be global effects of having an entire bounding contour 
available. For example, in the degenerate case of an image of a sphere lit from the 
viewing direction, the saddle shaped solutions valid near the critical point at the 
center of the image cannot be extended out to the bounding contour. More work 
needs to be done to understand such global effects in general. 
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A6.1 Linear Dependence of the Power Series Con- 
stants 

If we have a linear equation Ax = b and A is singular, the equation has solutions 
if and only if b is in the range of A. In discussing the bounding contour case, we wound 
up with such a linear equation: the matrix A consisted of the constants attached to 
the unknown power series coefficients q\, <#, c*2, while b is a complicated expression 
containing second derivatives of R and the coefficients qo,cti. It is our aim in this 
appendix to organize the pieces of b and show that it is indeed in the column range 
of A. 

We will collect the various terms in b and factor out the different Rij second 
derivatives for each row of b. This will make b the sum of a set of column vectors, 
each of which is multiplied by some Rij. Note that R xx = R xy = R xq = at 
the bounding contour: we have from the chapter R(\,0,y,0,q) = R(0, 0, y,0, q), so 
R x (\,0, y,0,q) = after taking the derivative with respect to A. Taking further 
derivatives with respect to A, y, and q gives the result. 

We will arrange the columns in two tables: the top row gives the Rij to which 
the column is associated; the bottom three rows give the values of the column of 
entries of b that have Rij as a factor. 
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-ftxa 



R 



xy 



R 



xp 



itxq 



R 



yy 



R 



yp 



2q -l 2a iqo 

-4qoai \fIqH- 1 -Aqoai + 2a\qo -2ajq yf!q^-2qQ 

2q Q a\ -v^o"i + 2a iqQ 



2% a? -A/2^ai 



i? 



yq 



R 



■pp 



pq 







2qo - Ml 



Aa x ql 



„3„2 



2ai9o -4^ Q;i + 80^ -8af#j 8af^ 



..4„2 



-2a^ 2 90 a|-4af^ Aa\q z Q -4a\q% 
We can see that the i? pp , i? p9 and R qq columns are all multiples of the vector 

(1,— 2ai,af) . The Ry q , Ry p , and R xp are all multiples of the vector (0,1,— oti) T . 

Since R xx , Rxy> and R xp are all zero here, these are the only non-zero columns in 

whose span b lies. 

Are the columns (0, 1, -a\) T and (1, -2ai, a\ ) T contained in the column range 

of the matrix A? A has the form 



(-Rpy/^-Rp) -R q q 2 

From the chapter we know that 

ai = Rp L _ 1 

-R? V \/2"<Zo 

so the last column of A is a multiple of (0,1, -a\). We also want to see that 
(l,—2ai,al) T is in the column space of A: we add —l/R q times the second col- 
umn of A to -R p /(4R q q y/2qo) times the last column of A: 



1 + 
Rv (9 ^ \ Rp 

jet v v 7 ^/ #73?7 
"4 v 1 " vfej ( _1 + tI:) + ~^fe V 1 ~ 7237) 

this, using the expression for a\, is just (1, -2a\,a\) T . Thus, the range of A does 
include the vector b as required for there to be any solutions of Ax = b. 



~ 2 k v - 7k) 
-~4v 1_ v^j 
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Chapter 7 



Conclusions and Pointers 



7.1 Conclusions 

An image of a surface is the result of the interaction of a number of processes: 
how points in space are mapped to the image, how light is reflected from a surface 
and concentrated to form the image, the location and nature of light sources in the 
environment, and the geometry of the surface sitting in space. In trying to find out 
the constraints that an image and an assumed reflectance function put on potential 
solution surfaces, we have examined several local and global sources of information 
in the image. 

We have examined three different sorts of small patches in the image. The first 
kind of patch contains neither good critical points nor any part of a bounding contour. 
In this case, the image dynamical system described in Chapter 4 will take any initial 
strip (positions in space with surface orientations) consistent with the image and 
not parallel to characteristic trajectories, and extend it to a solution surface patch 
consistent with the image patch. If we specify a path on the image patch, then 
the ambiguity of the solution surfaces can be summarized as the choice of a depth 
function along this curve; the reflectance function and contact 1-form will provide 
the orientation data needed to make an initial strip, and the characteristic strips will 
draw out a solution from this patch. 
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A second kind of patch we have analyzed contains a piece of the bounding con- 
tour. In this case we suggest (but have not completely proved) that the ambiguity 
of solution surfaces is of the same type as for an image interior patch, except that 
the initial path in the image is the bounding contour. A choice of depths along the 
bounding contour in the patch should lead to characteristics that draw out a solution 
surface consistent with the data on this patch. The bounding contour image data 
does provide extra data about the space-invariant reflectance function that might 
have generated the image: since we know the surface orientations along the bound- 
ing contour, and (theoretically) we know the brightness values along the bounding 
contour, our assumed reflectance map must match these values. 

The third kind of patch is one on the image interior containing a critical point 
of both the image and the reflectance function. Given a general space invariant 
reflectance function, we showed in Chapter 5 that critical points of both the image 
and the reflectance function ("good" critical points) theoretically provide a great deal 
of information about surfaces that may have given rise to the image. We considered 
a good critical point as a critical point in a dynamical system on C(IR 3 ,2), the 
space of all possible two-dimensional tangent spaces at each point in 1R 3 . Possible 
solution surfaces are surfaces that are invariant manifolds of the image dynamical 
system and contain the critical point. For very good critical points (those due to 
local maxima or minima of the reflectance function), this limits the possible solution 
surface patches consistent with the image of a generic surface to at most four. Two 
of these will be the unstable and stable manifolds of the dynamical system, and the 
local surface patch around the critical point can be expanded by following (either 
in positive or negative time) the trajectories away from the critical point. In the 
usual case of a maximum in brightness due to a reflectance function maximum, the 
stable and unstable manifold correspond to the convex and concave concave solution 
surface shapes. The two remaining possible invariant solution surfaces give rise to 
a saddle dynamical system on the solution surface, and so the surface cannot be 
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fully extended just by following the trajectories. In the case of a reflectance function 
maximum, these solution surfaces are saddle shaped in space at the critical point. 
In the case when symmetry is present (for example, a sphere lit from the viewing 
direction) the stable and unstable manifolds (concave and convex solution surfaces) 
will still be unique, but there may be an infinite number of solution surfaces with 
saddle dynamics consistent with the image data near the critical point. 

In summary, for an interior image patch of a generic surface containing a very 
good critical point, the surface in general is theoretically determined up to a finite 
number of possible solutions. For a small image patch from the image interior not 
containing a critical point, we only determine the surface up to an arbitrary space 
curve (locally). For a small image patch containing a piece of the bounding contour, 
the solution surface also appears to be determined only up to an arbitrary space 
curve. 

As described in Chapter 5, we have implemented two related techniques for 
finding the unstable manifold of an image dynamical system in a region near a 
critical point, and have seen that they provide robust solutions for the local shape 
from shading problem. We use the image dynamical flow to deform an initial surface 
over time; the Lambda Lemma says that as t goes to infinity, the deformed surface 
should C 1 approach the unstable manifold near the critical point. In one case we 
used an iterative scheme based on the equations 

v p = X(E X - p x R p - pyR q ) 
v q = \(E y - q x R p - q y R q ) 

where p(x, y) and q(x, y) are the orientation coordinates of the current estimated 
surface, and v p (x,y) and v q (x,y) represent the change in p(x,y) and q(x,y) to get 
new surface orientation coordinates. A is a step size parameter, and can be a function 
of (x, y). In the other case, we used the dynamical system flow to move a mesh of 
points (xi,yj,p(xi,yj),q(xi,yj)) on the solution surface a small amount; we then 
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reset the mesh using a least squares estimate of the new mesh values p(xi, yj) and 
q(xi,yj) at the original (xi,yj) sites. 

Several constraints on a space invariant reflectance function are also available 
from the image dynamical system. First, there are the constraints from the brightness 
at "good" critical points and brightnesses along the bounding contour, where the 
surface orientations are known. Second, if the image is assumed to come from a 
single, smooth surface, then this surface is a smooth invariant manifold of the image 
dynamical system. In order for the image dynamical system defined by the image 
and an assumed reflectance function to have a solution surface, some subset of the 
invariant manifolds (including stable and unstable manifolds of convex and concave 
critical points) must merge rather than intersect in a one-dimensional curve. This 
is unstable behavior for generic dynamical systems, and so may provide a constraint 
on the choice of reflectance functions: although numerical errors will prevent exact 
matching of the invariant manifolds even with exact knowledge of the reflectance 
function, a wrong choice is likely to prevent the invariant manifolds from being even 
close together. 

The modern methods of differential geometry provide a set of tools for reasoning 
about geometry and shape without always being tied to coordinate system expres- 
sions. They allow one to look at all the information available from the image and 
reflectance function in the image irradiance problem and see how properties of the 
dynamical system influence choices of possible solution surfaces; they can provide 
constraints on reflectance function choices as well. The theoretical view of the image 
irradiance problem as a dynamical system also suggests computational approaches 
to finding solution surfaces that are theoretically tractable. 

7.2 Future Work 

As reported in Chapter 5, we have not considered in detail the case of a good 
critical point caused by a saddle in the reflectance function. The critical point in 
this case may no longer be hyperbolic, and the study of the invariant manifolds 
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may be considerably more complicated. With pure imaginary eigenvalues, it may 
be possible for the characteristic trajectories to be closed orbits around the critical 
point. Neither closed orbits nor chaotic critical elements are considered here — it may 
of interest to find and study generic examples where these actually occur in an image 
dynamical system. 

As discussed in previous Chapters, the image dynamical system provides con- 
straints to restrict the choice of reflectance function based on the image data alone. 
Some kind of constraint is necessary: a slide projector on a flat white screen can give 
the impression of any possible three-dimensional scene. A vexing problem in under- 
standing human image understanding is how we tell the difference between painted 
and unpainted surfaces (consider again Figure 1.1). We can clearly be "fooled"; 
indeed, we choose to be fooled when we interpret a photograph as a window onto 
a scene. We also do often have an opinion about whether a surface in an image is 
painted, or is lit in an unusual way. 

What are reasonable reflectance functions to use in the absence of prior informa- 
tion? One might try to work through the physics of image formation as described 
by Horn and put the different constructions used (bidirectional reflectance map, ra- 
diance, and irradiance) into an invariant form connected to the geometry of the 
surfaces and lighting source distributions: for example, image irradiance becomes a 
two-form on the image. It would be interesting if reasonable properties of reflectance 
functions could be phrased in geometrical terms: e.g., symmetries, or other conser- 
vation principles, which might be applied to restrict the class of possible reflectance 
functions. 

Another approach to studying the reflectance function question is to set the 
entire image irradiance problem in an infinite dimensional context instead of the 
finite dimensional one we have used. In this case, we might consider the image to 
be a map from the space of possible surfaces and possible reflectance functions to 
brightness functions on the image. There has been considerable work in looking at 
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fluid mechanics and the calculus of variations from this perspective (Marsden and 
Hughes, 1983); it may be useful to try this here as well. 

The results of perturbing the image dynamical system with changes in reflectance 
functions should be examined: what else can go wrong globally if the wrong re- 
flectance function is used, even if the bounding contour and the critical point bright- 
nesses are matched? If there are no additional global inconsistencies, then how 
different are perturbed solution surfaces from the correct ones? Other global con- 
straints that could be provided by the dynamical system should be explored, e.g., 
the possibility of trajectories folding over or the possibility of self-intersections of the 
solution surface in 1R ; these are not likely to be correct or stable solutions (unless 
perhaps for semi-transparent surfaces) given a smooth image. 

We have focused on critical points and small patches containing the bounding 
contour. A critical point in an image can often be inferred from the brightness 
contours near it even if the critical point itself is obscured. A fuller exploration 
of the role of brightness contour configurations in determining solution surfaces is 
needed. One suspicion is that an annular region which "guarantees" a critical point 
on its missing interior might alone be sufficient to give invariant manifold results 
like those for critical points themselves, or perhaps provide limits on the behavior of 
possible invariant manifolds. 

In footnote 7 of Chapter 5 (p. 109) we mention that the image dynamical system 
can be considered as a Hamiltonian dynamical system; this could be explored either 
on the six dimensional dual space T*]R, 3 where the null-space of a 1-form in T*IR 3 
at a point p can be used to identify a tangent space at p, or in the space invariant case 
this could be explored on the four dimensional reduced space C(JR 3 , 2). As such, the 
behavior of "nearby" Hamiltonian systems as opposed to nearby generic dynamical 
systems is of interest: changing reflectance functions really corresponds to changing 
the Hamiltonian. Other standard features of Hamiltonian systems which we have 
not studied are closed orbits and chaotic orbits (Abraham and Marsden, 1985) as 
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mentioned earlier: under what conditions do these occur in an image dynamical 
system? Probably chaotic critical elements are evidence that we are far from the 
right track in choosing a reflectance function. 

The work reported here examines in detail the image interpretation problem 
defined as exact recovery of the orientation at each point of the image. It is not 
clear that this is what the human perceptual system is concerned with. What are 
properties of a solution surface under reflectance function changes that people think 
remain constant and connected to the shape? How do alterations in the theoretical 
solution surface under changes in the reflectance function consistent with bounding 
contour and critical point data compare with human perceptual accuracy from the 
same image? 

In Chapter 5 we suggested a couple of computational methods for finding invari- 
ant manifolds. From the theoretical description we have given, invariant manifolds 
of the dynamical system containing the critical points are important features to look 
for: better, more efficient ways of finding them would be of interest. We explored 
first order methods; as suggested in Chapter 5, perhaps higher-order methods similar 
to the Runge-Kutta vector field integration methods would give more effective re- 
sults. We used a Connection Machine to explore the methods; other kinds of parallel 
architectures or neural network techniques could be explored. 

The development of a system to actually solve shape from shading using these 
methods would be very interesting, even if the reflectance function must be provided 
beforehand. First, some reliable method of finding the stable and unstable manifolds 
of critical points drawn out as far as possible would be needed, perhaps including 
adaptive determination of the convergence region. Second, some comparison proce- 
dure to decide whether two such manifolds are the same or different would be needed: 
as discussed in Chapter 5, because of computational errors, exact matching cannot 
be expected even if the reflectance function is chosen exactly right. One way to 
accomplish the matching could be to tesselate the image into regions centered over 
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what one hopes are good critical points; for a typical smooth gently curved surface, 
there should be few such points (although in theory there could be many). At each 
critical point, the stable and unstable manifold could be computed; if a region con- 
tains bounding contour elements, this may be used to discard certain solutions if the 
invariant manifold does not come "close enough" to the bounding contour within 
a feasible distance from the observer. Choices from the remaining sets of invariant 
manifolds would have to be stitched together along the boundaries of regions: choices 
that were not "close enough" to each other would be discarded. There is one addi- 
tional complexity: certain regions will not be either the stable or unstable manifold 
because the critical point is actually a saddle point on the surface; the surface solu- 
tion here will have to be drawn out by extending other consistent stable or unstable 
manifolds to include this region. Some measure of goodness of fit of the solution 
would be needed based on how difficult it is to stitch the solution surfaces together. 

With a system like this in hand, one could try to handle unknown reflectance 
functions as well: postulate a reflectance function consistent with the bounding con- 
tour and the critical points and see how bad the solution surface is. Note that 
there is a fair bit of specification already involved: a typical parameterized family 
of reflectance functions might very well be completely determined by the bound- 
ing contour data and the reflectance function maxima data. As suggested above, it 
would be interesting to explore the changes in both the qualitative behavior and the 
quantitative badness (in the sense of trouble stitching together the segments) as a 
result of changing the reflectance functions without changing the critical points or 
the bounding contour data. 

This geometric method of analysis can be extended to include other cues. Perhaps 
the most natural extension is to include time: how do we make use of all the visual 
information available from a black and white television? As discussed in Chapter 
2, one could consider theoretically a time-dependent embedding of a surface S, i : 
S x R — > IR ; depending on the choice of class for i, we could study rigid motions of 
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the object or observer or various kinds of distortions of the object. This generates a 
time-dependent image dynamical system, and it would be interesting to study this 
to see how the time dimension affected features of the dynamical system. 

One could also extend the analysis and methods to the case of flat surfaces or 
zero Gaussian curvature surfaces. An extended region in the image with constant 
brightness is not necessarily the image of a planar surface; it could be a carefully 
shaped and positioned curved surface (with zero Gaussian curvature), one whose 
normals all lie on a single constant brightness contour of a reflectance function. 
((Brooks, 1982) discusses this in some detail.) The image is very unstable: a slight 
shift in position and the image will probably no longer have an open region of constant 
brightness, in contrast to the image of a planar surface which will still give a region 
of constant brightness if it is rigidly shifted. Perhaps by itself this sort of instability 
justifies a visual system labeling as planar an open region of constant brightness? 

Extending this last idea, if issues of "stability" can be connected to the "likeli- 
hood" of a particular interpretation of an image being correct, then it is of interest to 
look at the class of transformations allowed in the definition of stability. If non-rigid 
transformations of surfaces are considered in deciding on stability of an interpre- 
tation, then the plane does not provide a stable constant brightness image either. 
Our experience that rigid motions of objects are important (ubiquitous because they 
correspond to the effects of observer motion) may lead us first to interpret images 
consistent with stability in the class of rigid motions; perhaps there is a hierarchy 
of object transformations, from rigid motion through area preserving distortions to 
general diffeomorphisms, which are involved in analyzing an image or moving im- 
age using stability. A more careful use of catastrophe theory to study the effects of 
transformations on the structure of the image dynamical system may be useful. 

The ideas and methods of modern differential geometry suggest avenues both for 
theoretical and computational research in understanding human and machine vision. 
A visual problem formulated with most of the available information can be examined 

207 



Chapter 7 Conclusions and Pointers 

both globally and locally with these techniques, as we have done with the image 
dynamical system for the shape from shading problem. The coordinate independent 
approach to geometric ideas allows coordinate choices to be made specifically to 
explore a particular feature of the problem. 

There is an entire literature on dynamical systems developed over the last twenty 
years to study features of complicated dynamical systems. The mathematical tools 
were developed to help analyze geometric visualizations of complicated physical prob- 
lems. It is also worthwhile to use these tools to study the principles behind vision 
itself. 
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